VERIFIABLE AND COMPUTABLE $\ell_\infty$ PERFORMANCE ANALYSIS OF
SPARSITY RECOVERY

GONGGUO TANG AND ARYE NEHORAI

Abstract. In this paper, we develop verifiable and computable performance analysis of sparsity
recovery. We define a family of goodness measures for arbitrary sensing matrices as a set of
optimization problems, and design algorithms with a theoretical global convergence guarantee to
compute these goodness measures. The proposed algorithms solve a series of second-order cone
programs or linear programs. As a by-product, we implement an efficient algorithm to verify a
sufficient condition for exact sparsity recovery in the noise-free case. We derive performance
bounds on the recovery errors in terms of these goodness measures. We also analytically
demonstrate that the developed goodness measures are non-degenerate for a large class of random
sensing matrices, as long as the number of measurements is relatively large. Numerical experiments
show that, compared with the restricted isometry based performance bounds, our error bounds apply
to a wider range of problems and are tighter when the sparsity levels of the signals are
relatively low.

1. Introduction.

Sparse  signal  recovery  (or  compressive  sensing)  has  revolutionized  the  way  we 
think  of  signal  sampling  [11].  It  goes  far  beyond  sampling  and  has  also  been  applied 
to  areas  as  diverse  as  medical  imaging,  remote  sensing,  radar,  sensor  arrays,  image 
processing,  computer  vision,  and  so  on.  Mathematically,  sparse  signal  recovery  aims 
to  reconstruct  a  sparse  signal,  namely  a  signal  with  only  a  few  non-zero  components, 
from  usually  noisy  linear  measurements: 

$$y = Ax + w, \tag{1.1}$$

where $x \in \mathbb{R}^n$ is the sparse signal, $y \in \mathbb{R}^m$ is the measurement vector,
$A \in \mathbb{R}^{m \times n}$ is the sensing/measurement matrix, and $w \in \mathbb{R}^m$ is the
noise. A theoretically justified way to exploit the sparseness in recovering $x$ is to minimize
its $\ell_1$ norm under certain constraints [9].

In this paper, we investigate the problem of using the $\ell_\infty$ norm as a performance
criterion for sparse signal recovery via $\ell_1$ minimization. Although the $\ell_2$ norm has been
used as the performance criterion by the majority of published research in sparse signal
recovery, the adoption of the $\ell_\infty$ norm is well justified. Other popular performance
criteria, such as the $\ell_1$ and $\ell_2$ norms of the error vectors, can all be expressed in
terms of the $\ell_\infty$ norm in a tight and non-trivial manner. More importantly, the
$\ell_\infty$ norm of the error vector has a direct connection with the support recovery problem.
To see this, suppose we know a priori the minimal non-zero absolute value of the
components of the sparse signal; then controlling the $\ell_\infty$ norm of the error within half
of that value guarantees exact recovery of the support. Support recovery is arguably
one  of  the  most  important  and  challenging  problems  in  sparse  signal  recovery.  In 
practical  applications,  the  support  is  usually  physically  more  significant  than  the 
component  values.  For  example,  in  radar  imaging  using  sparse  signal  recovery,  the 
sparsity  constraints  are  usually  imposed  on  the  discretized  time-frequency  domain. 
The  distance  and  velocity  of  a  target  have  a  direct  correspondence  to  the  support  of 
the  sparse  signal.  The  magnitude  determined  by  coefficients  of  reflection  is  of  less 
physical  significance  [1,18,19].  Refer  to  [27]  for  more  discussions  on  sparse  support 
recovery. 

Another, perhaps more important, reason to use the $\ell_\infty$ norm as a performance
criterion is the verifiability and computability of the resulting performance bounds.
A general strategy to study the performance of sparse signal recovery is to define a
measure of the goodness of the sensing matrix, and then derive performance bounds in
terms of the goodness measure. The most well-known goodness measure is undoubtedly
the restricted isometry constant (RIC) [7]. Upper bounds on the $\ell_2$ and $\ell_1$ norms
of the error vectors for various recovery algorithms have been expressed in terms of
the RIC. Unfortunately, it is extremely difficult to verify that the RIC of a specific
sensing matrix satisfies the conditions for the bounds to be valid, and even more
difficult to directly compute the RIC itself. In fact, the only known sensing matrices
with nice RICs are certain types of random matrices [20]. By using the $\ell_\infty$ norm
as a performance criterion, we develop a framework in which a family of goodness
measures for the sensing matrices are verifiable and computable. The computability
further justifies the connection of the $\ell_\infty$ norm with the support recovery problem,
since for the connection described in the previous paragraph to be practically useful,
we must be able to compute the error bounds on the $\ell_\infty$ norm.

The  verifiability  and  computability  open  doors  for  wide  applications.  In  many 
practical  applications  of  sparse  signal  recovery,  e.g.,  radar  imaging  [24],  sensor  arrays 
[22],  DNA  microarrays  [25],  and  MRI  [21],  it  is  beneficial  to  know  the  performance 
of  the  sensing  system  before  its  implementation  and  the  taking  of  measurements. 
In  addition,  in  these  application  areas,  we  usually  have  the  freedom  to  optimally 
design  the  sensing  matrix.  For  example,  in  MRI  the  sensing  matrix  is  determined  by 
the  sampling  trajectory  in  the  Fourier  domain;  in  radar  systems  the  optimal  sensing 
matrix  design  is  connected  with  optimal  waveform  design,  a  central  topic  of  radar 
research.  To  optimally  design  the  sensing  matrix,  we  need  to 

1. analyze how the performance of recovering $x$ from $y$ is affected by $A$, and
define a function $\omega(A)$ to accurately quantify the goodness of $A$ in the context
of sparse signal reconstruction;

2. develop algorithms to efficiently verify that $\omega(A)$ satisfies the conditions for
the bounds to hold, as well as to efficiently compute $\omega(A)$ for arbitrarily given $A$;

3. design mechanisms to select within a matrix class the sensing matrix that is
optimal in the sense of best $\omega(A)$.

In  this  paper,  we  successfully  address  the  first  two  points  in  the  performance 
analysis  framework. 

We now preview our contributions. First of all, we propose using the $\ell_\infty$ norm as
a performance criterion for sparse signal recovery and establish its connections with
other performance criteria. We define a family of goodness measures of the sensing
matrix, and use them to derive performance bounds on the $\ell_\infty$ norm of the recovery
error vector. Performance bounds using other norms are expressed using the $\ell_\infty$
norm. Numerical simulations show that these bounds are tighter than the RIC based
bounds when the sparsity levels of the signals are relatively small. Secondly and most
importantly, using fixed point theory, we develop algorithms to efficiently compute
the goodness measures for given sensing matrices by solving a series of second-order
cone programs or linear programs, depending on the specific goodness measure being
computed. We analytically demonstrate the algorithms' convergence to the global
optima from any initial point. As a by-product, we obtain a fast algorithm to verify the
sufficient condition guaranteeing exact sparse recovery via $\ell_1$ minimization. Finally,
we show that the goodness measures are non-degenerate for subgaussian and isotropic
random sensing matrices as long as the number of measurements is relatively large, a
result parallel to that of the RIC for random matrices.

Several  attempts  have  been  made  to  address  the  verifiability  and  computability 
of  performance  analysis  for  sparse  signal  recovery,  mainly  based  on  the  RIC  [7, 9]  and 
the  Null  Space  Property  (NSP)  [13].  Due  to  the  difficulty  of  explicitly  computing  the 
RIC  and  verifying  the  NSP,  researchers  use  relaxation  techniques  to  approximate  these 
quantities.  Examples  include  semi-definite  programming  relaxation  [14, 15]  and  linear 
programming relaxation [20]. To the best of the authors' knowledge, the algorithms
of [14] and [20] represent state-of-the-art techniques in verifying the sufficient condition
of unique $\ell_1$ recovery. In this paper, we directly address the computability of the
performance bounds. More explicitly, we define the goodness measures of the sensing
matrices as optimization problems and design efficient algorithms with theoretical
convergence guarantees to solve the optimization problems. An algorithm to verify
a sufficient condition for exact $\ell_1$ recovery is obtained only as a by-product. Our
implementation of the algorithm performs orders of magnitude faster than the
state-of-the-art techniques in [14] and [20], consumes much less memory, and produces
comparable results.

The paper is organized as follows. In Section 2, we introduce notations, and we
present the measurement model, three convex relaxation algorithms, and the sufficient
and necessary condition for exact $\ell_1$ recovery. In Section 3, we derive performance
bounds on the $\ell_\infty$ norms of the recovery errors for several convex relaxation
algorithms. In Section 4, we design algorithms to verify a sufficient condition for exact
$\ell_1$ recovery in the noise-free case, and to compute the goodness measures of arbitrarily
given sensing matrices. Section 5 is devoted to the probabilistic analysis of our
$\ell_\infty$ performance measures. We evaluate the algorithms' performance in Section 6.
Section 7 summarizes our conclusions.

2. Notations, Measurement Model, and Recovery Algorithms. In this
section, we introduce notations and the measurement model, and review recovery
algorithms based on $\ell_1$ minimization.

For any vector $x \in \mathbb{R}^n$, the norm $\|x\|_{k,1}$ is the summation of the absolute
values of the $k$ (absolutely) largest components of $x$. In particular, the $\ell_\infty$ norm
$\|x\|_\infty = \|x\|_{1,1}$ and the $\ell_1$ norm $\|x\|_1 = \|x\|_{n,1}$. The classical inner
product in $\mathbb{R}^n$ is denoted by $\langle \cdot, \cdot \rangle$, and the $\ell_2$ (or Euclidean)
norm is $\|x\|_2 = \sqrt{\langle x, x \rangle}$. We use $\|\cdot\|_\diamond$ to denote a general
norm.

The support of $x$, $\mathrm{supp}(x)$, is the index set of the non-zero components of $x$. The
size of the support, usually denoted by the $\ell_0$ "norm" $\|x\|_0$, is the sparsity level of $x$.
Signals of sparsity level at most $k$ are called $k$-sparse signals. If $S \subseteq \{1, \dots, n\}$
is an index set, then $|S|$ is the cardinality of $S$, and $x_S \in \mathbb{R}^{|S|}$ is the vector
formed by the components of $x$ with indices in $S$.
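The notation above can be made concrete with a minimal Python sketch. The helper names `norm_k1` and `support` are hypothetical and chosen only for illustration; indices are 0-based here, whereas the text uses 1-based indices.

```python
def norm_k1(x, k):
    """Sum of the k largest absolute values of x, i.e. ||x||_{k,1}."""
    magnitudes = sorted((abs(v) for v in x), reverse=True)
    return sum(magnitudes[:k])


def support(x, tol=0.0):
    """Indices (0-based) of the components of x whose magnitude exceeds tol."""
    return {i for i, v in enumerate(x) if abs(v) > tol}


x = [0.0, -3.0, 0.5, 0.0, 2.0]
linf_norm = norm_k1(x, 1)        # ||x||_inf = ||x||_{1,1}
l1_norm = norm_k1(x, len(x))     # ||x||_1   = ||x||_{n,1}
sparsity = len(support(x))       # ||x||_0
print(linf_norm, l1_norm, sparsity)
```

Sorting is the simplest way to pick the $k$ largest magnitudes; for large $n$ a partial selection would be cheaper, but that is beside the point of the illustration.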

We use $e_i$, $0$, $O$, and $\mathbf{1}$ to denote respectively the $i$th canonical basis vector, the
zero column vector, the zero matrix, and the column vector with all ones.

Suppose $x$ is a $k$-sparse signal. In this paper, we observe $x$ through the following
linear model:

$$y = Ax + w, \tag{2.1}$$

where $A \in \mathbb{R}^{m \times n}$ is the measurement/sensing matrix, $y$ is the measurement vector,
and $w$ is noise.

Many algorithms have been proposed to recover $x$ from $y$ by exploiting the sparseness
of $x$. We focus on three algorithms based on $\ell_1$ minimization: the Basis
Pursuit [12], the Dantzig selector [10], and the LASSO estimator [29].


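$$\text{Basis Pursuit:} \quad \min_{z \in \mathbb{R}^n} \|z\|_1 \quad \text{s.t.} \quad \|y - Az\|_2 \le \varepsilon \tag{2.2}$$

$$\text{Dantzig:} \quad \min_{z \in \mathbb{R}^n} \|z\|_1 \quad \text{s.t.} \quad \|A^T(y - Az)\|_\infty \le \mu \tag{2.3}$$

$$\text{LASSO:} \quad \min_{z \in \mathbb{R}^n} \frac{1}{2}\|y - Az\|_2^2 + \mu\|z\|_1. \tag{2.4}$$

Here $\mu$ is a tuning parameter, and $\varepsilon$ is a measure of the noise level. All three
optimization problems have efficient implementations using convex programming or even
linear programming.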

In the noise-free case where $w = 0$, roughly speaking, all three algorithms
reduce to

$$\min_{z \in \mathbb{R}^n} \|z\|_1 \quad \text{s.t.} \quad Az = Ax, \tag{2.5}$$

which is the $\ell_1$ relaxation of the NP-hard $\ell_0$ minimization problem:

$$\min_{z \in \mathbb{R}^n} \|z\|_0 \quad \text{s.t.} \quad Az = Ax. \tag{2.6}$$
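As an illustration of why (2.5) is a linear program, the following sketch splits $z = u - v$ with $u, v \ge 0$ and passes the result to `scipy.optimize.linprog`. This is a generic LP reformulation under stated assumptions, not the solver used in the paper; the function name `l1_min`, the Gaussian test matrix, and the problem sizes are hypothetical choices.

```python
import numpy as np
from scipy.optimize import linprog


def l1_min(A, y):
    """Solve min ||z||_1 s.t. Az = y as an LP with z = u - v, u, v >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)             # sum(u) + sum(v) equals ||z||_1 at the optimum
    A_eq = np.hstack([A, -A])      # A(u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]


rng = np.random.default_rng(0)
m, n, k = 20, 40, 3
A = rng.standard_normal((m, n))
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

x_hat = l1_min(A, A @ x)
print("l_inf recovery error:", np.max(np.abs(x_hat - x)))
```

The solution is always feasible and has $\ell_1$ norm no larger than that of $x$; whether it coincides with $x$ depends on the condition discussed next.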

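A minimal requirement on $\ell_1$ minimization algorithms is the uniqueness and
exactness of the solution $\hat{x} \overset{\mathrm{def}}{=} \operatorname{argmin}_{z : Az = Ax} \|z\|_1$,
i.e., $\hat{x} = x$. When the true signal $x$ is $k$-sparse, the sufficient and necessary
condition for exact $\ell_1$ recovery is [16,17,30]

$$\sum_{i \in S} |z_i| < \sum_{i \notin S} |z_i|, \quad \forall z \in \mathrm{Ker}(A) \setminus \{0\},\ |S| \le k, \tag{2.7}$$

where $\mathrm{Ker}(A) \overset{\mathrm{def}}{=} \{z : Az = 0\}$ is the kernel of $A$, and
$S \subseteq \{1, \dots, n\}$ is an index set. Expressed in terms of $\|\cdot\|_{k,1}$, the necessary
and sufficient condition becomes

$$\|z\|_{k,1} < \frac{1}{2}\|z\|_1, \quad \forall z \in \mathrm{Ker}(A) \setminus \{0\}. \tag{2.8}$$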
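The step from the index-set condition to the $\|\cdot\|_{k,1}$ form can be spelled out as follows. For a fixed $z$, the sum over $S$ is largest when $S$ collects the $k$ largest magnitudes of $z$, and the sum over the complement is $\|z\|_1$ minus the sum over $S$:

```latex
\begin{align*}
\sum_{i \in S} |z_i| < \sum_{i \notin S} |z_i| \quad \forall\, |S| \le k
  &\iff 2 \sum_{i \in S} |z_i| < \|z\|_1 \quad \forall\, |S| \le k \\
  &\iff 2 \max_{|S| \le k} \sum_{i \in S} |z_i| < \|z\|_1 \\
  &\iff \|z\|_{k,1} < \tfrac{1}{2} \|z\|_1 .
\end{align*}
```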

The approaches in [20] and [14] for verifying the sufficient condition (2.8) are
based on relaxing the following optimization problem in various ways:

$$\alpha_k = \max_z \|z\|_{k,1} \quad \text{s.t.} \quad Az = 0,\ \|z\|_1 \le 1. \tag{2.9}$$

Clearly, $\alpha_k < 1/2$ is necessary and sufficient for exact recovery of $k$-sparse signals.
Unfortunately, the direct computation of (2.9) for general $k$ is extremely difficult: it
is the maximization of a norm (convex function) over a polyhedron (convex set) [4].
In [20], in a very rough sense, $\alpha_1$ was computed by solving $n$ linear programs:

$$\min_{y_i} \|e_i - A^T y_i\|_\infty, \quad i = 1, \dots, n, \tag{2.10}$$

where $e_i$ is the $i$th canonical basis vector in $\mathbb{R}^n$. This, together with the observation
that $\alpha_k \le k\alpha_1$, yields an efficient algorithm to verify (2.8). However, in [26], we found
that the primal-dual method of directly solving (2.8) as the following $n$ linear programs

$$\max_z z_i \quad \text{s.t.} \quad Az = 0,\ \|z\|_1 \le 1 \tag{2.11}$$

gives rise to an algorithm orders of magnitude faster. In the next section, we will
see how the computation of $\alpha_1$ arises naturally in the context of $\ell_\infty$ performance
evaluation.
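The observation relating the general-$k$ quantity to the $k = 1$ quantity follows from a one-line norm inequality, sketched here for completeness. The sum of the $k$ largest magnitudes is at most $k$ times the largest one, so maximizing both sides over the same feasible set gives:

```latex
\begin{align*}
\|z\|_{k,1} \le k \|z\|_\infty = k \|z\|_{1,1}
  \quad &\Longrightarrow \quad
  \alpha_k = \max_{Az = 0,\ \|z\|_1 \le 1} \|z\|_{k,1}
  \le k \max_{Az = 0,\ \|z\|_1 \le 1} \|z\|_{1,1} = k \alpha_1 , \\
k \alpha_1 < \tfrac{1}{2}
  \quad &\Longrightarrow \quad
  \alpha_k < \tfrac{1}{2} .
\end{align*}
```

The resulting test is only sufficient: $k\alpha_1 \ge 1/2$ does not by itself rule out exact recovery.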

3. Performance Bounds on the $\ell_\infty$ Norms of the Recovery Errors. In
this section, we derive performance bounds on the $\ell_\infty$ norms of the error vectors. We
first establish a proposition characterizing the error vectors for the $\ell_1$ recovery algorithms,
whose proof is given in Appendix 8.1.

Proposition 3.1. Suppose $x$ in (2.1) is $k$-sparse and the noise $w$ satisfies
$\|w\|_2 \le \varepsilon$, $\|A^T w\|_\infty \le \mu$, and $\|A^T w\|_\infty \le \kappa\mu$, $\kappa \in (0,1)$, for the
Basis Pursuit, the Dantzig selector, and the LASSO estimator, respectively. Define
$h = \hat{x} - x$ as the error vector for any of the three $\ell_1$ recovery algorithms (2.2), (2.3),
and (2.4). Then we have

$$c\|h\|_{k,1} \ge \|h\|_1, \tag{3.1}$$

where $c = 2$ for the Basis Pursuit and the Dantzig selector, and $c = 2/(1 - \kappa)$ for the
LASSO estimator.

An immediate corollary of Proposition 3.1 is to bound the $\ell_1$ and $\ell_2$ norms of the
error vector using the $\ell_\infty$ norm:

Corollary 3.2. Under the assumptions of Proposition 3.1, we have

$$\|h\|_1 \le ck\|h\|_\infty, \tag{3.2}$$

$$\|h\|_2 \le \sqrt{ck}\,\|h\|_\infty. \tag{3.3}$$

Furthermore, if $S = \mathrm{supp}(x)$ and $\beta = \min_{i \in S} |x_i|$, then $\|h\|_\infty < \beta/2$ implies

$$\mathrm{supp}(\max(|\hat{x}| - \beta/2, 0)) = \mathrm{supp}(x), \tag{3.4}$$

i.e., a thresholding operator recovers the signal support.
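The thresholding step can be sketched in a few lines of Python. The estimate `x_hat` below is a made-up example with $\ell_\infty$ error below $\beta/2$, not the output of any of the three algorithms, and `threshold_support` is a hypothetical helper name.

```python
def threshold_support(x_hat, beta):
    """Support of max(|x_hat| - beta/2, 0): indices whose magnitude exceeds beta/2."""
    return {i for i, v in enumerate(x_hat) if abs(v) - beta / 2 > 0}


x = [0.0, 1.0, 0.0, -2.0, 0.0]             # true signal, support {1, 3} (0-based)
x_hat = [0.3, 0.7, -0.4, -1.6, 0.2]        # hypothetical estimate of x
beta = min(abs(v) for v in x if v != 0)    # smallest non-zero magnitude of x
err = max(abs(a - b) for a, b in zip(x_hat, x))   # l_inf norm of the error

print(err < beta / 2, threshold_support(x_hat, beta))
```

Every off-support entry of the estimate has magnitude below $\beta/2$ and every on-support entry has magnitude above $\beta/2$, which is exactly why the threshold separates them.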

For ease of presentation, we have the following definition:

Definition 3.3. For any real number $s \in [1, n]$ and matrix $A \in \mathbb{R}^{m \times n}$, define

$$\omega_\diamond(Q, s) = \min_{z :\, \|z\|_1/\|z\|_\infty \le s} \frac{\|Qz\|_\diamond}{\|z\|_\infty}, \tag{3.5}$$

where $Q$ is either $A$ or $A^T A$.

Now we present the error bounds on the $\ell_\infty$ norm of the error vectors for the
Basis Pursuit, the Dantzig selector, and the LASSO estimator.

Theorem 3.4. Under the assumption of Proposition 3.1, we have

$$\|\hat{x} - x\|_\infty \le \frac{2\varepsilon}{\omega_2(A, 2k)} \tag{3.6}$$

for the Basis Pursuit,

$$\|\hat{x} - x\|_\infty \le \frac{2\mu}{\omega_\infty(A^T A, 2k)} \tag{3.7}$$

for the Dantzig selector, and

$$\|\hat{x} - x\|_\infty \le \frac{(1 + \kappa)\mu}{\omega_\infty(A^T A, 2k/(1 - \kappa))} \tag{3.8}$$

for the LASSO estimator.

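Proof. Observe that for the Basis Pursuit

$$\|A(\hat{x} - x)\|_2 \le \|y - A\hat{x}\|_2 + \|y - Ax\|_2 \le \varepsilon + \|w\|_2 \le 2\varepsilon, \tag{3.9}$$

and similarly,

$$\|A^T A(\hat{x} - x)\|_\infty \le 2\mu \tag{3.10}$$

for the Dantzig selector, and

$$\|A^T A(\hat{x} - x)\|_\infty \le (1 + \kappa)\mu \tag{3.11}$$

for the LASSO estimator. By (3.2), the error vector $h = \hat{x} - x$ satisfies
$\|h\|_1/\|h\|_\infty \le ck$, so $h$ is feasible for the minimization in Definition 3.3 with $s = ck$,
which gives $\omega_\diamond(Q, ck)\,\|h\|_\infty \le \|Qh\|_\diamond$. The conclusions of Theorem 3.4 follow
by combining this inequality with (3.9), (3.10), and (3.11). □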

One of the primary contributions of this work is the design of algorithms that
compute $\omega_2(A, s)$ and $\omega_\infty(A^T A, s)$ efficiently. The algorithms provide a way to
numerically assess the performance of the Basis Pursuit, the Dantzig selector, and the
LASSO estimator according to the bounds given in Theorem 3.4. According to
Corollary 3.2, the correct recovery of signal support is also guaranteed by reducing the
$\ell_\infty$ norm to some threshold. In Section 5, we also demonstrate that the bounds in
Theorem 3.4 are non-trivial for a large class of random sensing matrices, as long as $m$ is
relatively large. Numerical simulations in Section 6 show that in many cases the error
bounds on the $\ell_2$ norms based on Corollary 3.2 and Theorem 3.4 are tighter than the
RIC based bounds. We expect the bounds on the $\ell_\infty$ norms in Theorem 3.4 to be even
tighter, as we do not need the relaxation in Corollary 3.2.
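To show how a computed goodness measure turns into numbers, here is a small sketch that evaluates the bounds of Theorem 3.4 and Corollary 3.2 from a given value of $\omega$. The function name `error_bounds` and the numerical inputs are hypothetical; the value of $\omega$ is assumed to have been computed already and to be positive.

```python
import math


def error_bounds(omega, k, noise_level, c=2.0, factor=2.0):
    """Error bounds implied by Theorem 3.4 and Corollary 3.2.

    omega       : omega(Q, c*k), assumed already computed and positive
    noise_level : epsilon (Basis Pursuit) or mu (Dantzig selector, LASSO)
    c           : 2 for Basis Pursuit / Dantzig, 2/(1 - kappa) for LASSO
    factor      : 2 for Basis Pursuit / Dantzig, (1 + kappa) for LASSO
    """
    if omega <= 0:
        raise ValueError("the bounds require omega > 0")
    linf = factor * noise_level / omega
    return {
        "linf": linf,                      # Theorem 3.4
        "l2": math.sqrt(c * k) * linf,     # via (3.3)
        "l1": c * k * linf,                # via (3.2)
        "min_beta": 2 * linf,              # support recovered if beta exceeds this
    }


bounds = error_bounds(omega=0.5, k=4, noise_level=0.01)
print(bounds)
```

The `min_beta` entry restates the support recovery condition of Corollary 3.2: thresholding succeeds whenever the smallest non-zero magnitude of the signal exceeds twice the $\ell_\infty$ bound.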

We note that a prerequisite for these bounds to be valid is the positiveness of the
involved $\omega_\diamond(\cdot)$. We call the validation of $\omega_\diamond(\cdot) > 0$ the verification problem. Note that
from Theorem 3.4, $\omega_\diamond(\cdot) > 0$ implies the exact recovery of the true signal $x$ in the
noise-free case. Therefore, verifying $\omega_\diamond(\cdot) > 0$ is equivalent to verifying a sufficient
condition for exact $\ell_1$ recovery.

4. Verification and Computation of $\omega_\diamond$. In this section, we present
algorithms for verification and computation of $\omega_\diamond(\cdot)$. We will present a very general
algorithm and make it specific only when necessary. For this purpose, we use $Q$ to
denote either $A$ or $A^T A$, and use $\|\cdot\|_\diamond$ to denote a general norm.

4.1. Verification of $\omega_\diamond > 0$. Verifying $\omega_\diamond(Q, s) > 0$ amounts to making sure
$\|z\|_1/\|z\|_\infty > s$ for all nonzero $z$ such that $Qz = 0$. Equivalently, we can compute

$$s^* = \min_{z \ne 0} \frac{\|z\|_1}{\|z\|_\infty} \quad \text{s.t.} \quad Qz = 0. \tag{4.1}$$

Then, when $s < s^*$, we have $\omega_\diamond(Q, s) > 0$. We rewrite the optimization (4.1) as

$$\frac{1}{s^*} = \max_z \|z\|_\infty \quad \text{s.t.} \quad Qz = 0,\ \|z\|_1 \le 1, \tag{4.2}$$

which is solved using the following $n$ linear programs:

$$\max_z z_i \quad \text{s.t.} \quad Qz = 0,\ \|z\|_1 \le 1. \tag{4.3}$$

The dual problem for (4.3) is

$$\min_\lambda \|e_i - Q^T \lambda\|_\infty, \tag{4.4}$$

where $e_i$ is the $i$th canonical basis vector.

We solve (4.3) using the primal-dual algorithm expounded in Chapter 11 of [5],
which gives an implementation much more efficient than the one for solving its dual
(4.4) in [20]. This method is also used to implement $\ell_1$-MAGIC for sparse signal
recovery [6]. Due to the equivalence of $A^T A z = 0$ and $Az = 0$, we always solve
(4.2) for $Q = A$ and avoid $Q = A^T A$. The former evidently involves solving linear
programs of smaller size. In practice, we usually replace $A$ with the matrix with
orthogonal rows obtained from the economy-size QR decomposition of $A^T$.
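The verification step can be sketched with a general-purpose LP solver. The code below assumes `scipy.optimize.linprog` in place of the primal-dual method of [5] used in the text, so it illustrates the formulation rather than the reported speed; the function name `s_star` and the small test matrix are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def s_star(Q):
    """s* = min ||z||_1/||z||_inf over nonzero z in Ker(Q), via n LPs of type (4.3)."""
    m, n = Q.shape
    A_eq = np.hstack([Q, -Q])          # Q(u - v) = 0 with z = u - v, u, v >= 0
    A_ub = np.ones((1, 2 * n))         # sum(u) + sum(v) <= 1 enforces ||z||_1 <= 1
    best = 0.0
    for i in range(n):
        c = np.zeros(2 * n)
        c[i], c[n + i] = -1.0, 1.0     # linprog minimizes, so minimize -z_i
        res = linprog(c, A_ub=A_ub, b_ub=[1.0], A_eq=A_eq, b_eq=np.zeros(m),
                      bounds=(0, None))
        best = max(best, -res.fun)     # best = max_i max z_i = 1/s*
    # A trivial kernel gives best = 0, i.e. every s passes the verification.
    return np.inf if best <= 1e-12 else 1.0 / best


Q = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])        # kernel spanned by (1, -1, 1)
print(s_star(Q))
```

For this example every kernel vector is a multiple of $(1, -1, 1)$, whose ratio of $\ell_1$ to $\ell_\infty$ norm is 3, so the sketch should report a value close to 3. Maximizing $z_i$ rather than $|z_i|$ suffices because the feasible set is symmetric about the origin.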

As a dual of (4.4), (4.3) (and hence (4.2) and (4.1)) shares the same limitation
as (4.4), namely, it verifies $\omega_\diamond > 0$ only for $s$ up to $2\sqrt{2m}$. We now reformulate
Proposition 4 of [20] in our framework:

Proposition 4.1. [20, Proposition 4] For any $m \times n$ matrix $A$ with $n \ge 32m$,
one has

$$s^* = \min\left\{ \frac{\|z\|_1}{\|z\|_\infty} : Qz = 0 \right\} \le 2\sqrt{2m}. \tag{4.5}$$

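4.2. Computation of $\omega_\diamond$. Now we turn to one of the primary contributions of
this work, the computation of $\omega_\diamond$. The optimization problem is as follows:

$$\omega_\diamond(Q, s) = \min_z \frac{\|Qz\|_\diamond}{\|z\|_\infty} \quad \text{s.t.} \quad \frac{\|z\|_1}{\|z\|_\infty} \le s, \tag{4.6}$$

or equivalently,

$$\frac{1}{\omega_\diamond(Q, s)} = \max_z \|z\|_\infty \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \frac{\|z\|_1}{\|z\|_\infty} \le s. \tag{4.7}$$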

We will show that $1/\omega_\diamond(Q, s)$ is the unique fixed point of a certain scalar function.
To this end, we define functions $f_{s,i}(\eta)$, $i = 1, \dots, n$, and $f_s(\eta)$ over $[0, \infty)$,
parameterized by $s \in (1, s^*)$:

$$f_{s,i}(\eta) \overset{\mathrm{def}}{=} \max_z \{ z_i : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta \}
= \max_z \{ |z_i| : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta \}, \tag{4.8}$$

since the domain for the maximization is symmetric about the origin, and

$$f_s(\eta) \overset{\mathrm{def}}{=} \max_z \{ \|z\|_\infty : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta \}
= \max_{z :\, \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta}\ \max_i |z_i|
= \max_i\ \max_{z :\, \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta} |z_i|
= \max_i f_{s,i}(\eta), \tag{4.9}$$

where for the last but one equality we have exchanged the two maximizations. For
$\eta > 0$, it is easy to show that strong duality holds for the optimization problem
defining $f_{s,i}(\eta)$. As a consequence, we have the dual form of $f_{s,i}(\eta)$:

$$f_{s,i}(\eta) = \min_\lambda\ s\eta\,\|e_i - Q^T \lambda\|_\infty + \|\lambda\|_\diamond^*, \tag{4.10}$$

where $\|\cdot\|_\diamond^*$ is the dual norm of $\|\cdot\|_\diamond$.
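The easy half of this duality (the dual objective upper-bounds the primal one) can be checked directly, which also explains where each term of the dual form comes from. For any $z$ feasible for (4.8) and any $\lambda$, Hölder's inequality and the definition of the dual norm give:

```latex
\begin{align*}
z_i &= \langle e_i - Q^T \lambda,\ z \rangle + \langle \lambda,\ Qz \rangle \\
    &\le \|e_i - Q^T \lambda\|_\infty \|z\|_1 + \|\lambda\|_\diamond^* \|Qz\|_\diamond \\
    &\le s\eta\, \|e_i - Q^T \lambda\|_\infty + \|\lambda\|_\diamond^* .
\end{align*}
```

Maximizing the left side over feasible $z$ and minimizing the right side over $\lambda$ shows that $f_{s,i}(\eta)$ is at most the dual value; strong duality, as stated in the text, closes the gap.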

In the definition of $f_s(\eta)$, we basically replaced the $\|z\|_\infty$ in the denominator of
the fractional constraint in (4.7) with $\eta$. The following theorem states that the unique
positive fixed point of $f_s(\eta)$ is exactly $1/\omega_\diamond(Q, s)$. See Appendix 8.2 for the proof.

Theorem 4.2. The functions $f_{s,i}(\eta)$ and $f_s(\eta)$ have the following properties:

1. $f_{s,i}(\eta)$ and $f_s(\eta)$ are continuous in $\eta$;

2. $f_{s,i}(\eta)$ and $f_s(\eta)$ are strictly increasing in $\eta$;

3. $f_{s,i}(\eta)$ is concave for every $i$;

4. $f_s(0) = 0$, $f_s(\eta) \ge s\eta > \eta$ for sufficiently small $\eta > 0$, and there exists $\rho < 1$
such that $f_s(\eta) < \rho\eta$ for sufficiently large $\eta$; the same holds for $f_{s,i}(\eta)$;

5. $f_{s,i}(\eta)$ and $f_s(\eta)$ have unique positive fixed points $\eta_i^* = f_{s,i}(\eta_i^*)$ and
$\eta^* = f_s(\eta^*)$, respectively; and $\eta^* = \max_i \eta_i^*$;

6. The unique positive fixed point of $f_s(\eta)$, $\eta^*$, is equal to $1/\omega_\diamond(Q, s)$;

7. For $\eta \in (0, \eta^*)$, we have $f_s(\eta) > \eta$; and for $\eta \in (\eta^*, \infty)$, we have $f_s(\eta) < \eta$;
the same statement holds also for $f_{s,i}(\eta)$;

8. For any $\epsilon > 0$, there exists $\rho_1(\epsilon) > 1$ such that $f_s(\eta) > \rho_1(\epsilon)\eta$ as long as
$0 < \eta \le (1 - \epsilon)\eta^*$; and there exists $\rho_2(\epsilon) < 1$ such that $f_s(\eta) < \rho_2(\epsilon)\eta$ as long
as $\eta \ge (1 + \epsilon)\eta^*$.

Theorem 4.2 implies three ways to compute the fixed point $\eta^* = 1/\omega_\diamond(Q, s)$
of $f_s(\eta)$.

1. Naive Fixed Point Iteration: Property 8) of Theorem 4.2 suggests that
the fixed point iteration

$$\eta_{t+1} = f_s(\eta_t), \quad t = 0, 1, \dots \tag{4.11}$$

starting from any initial point $\eta_0 > 0$ converges to $\eta^*$, no matter whether $\eta_0 < \eta^*$ or
$\eta_0 > \eta^*$. The algorithm can be made more efficient in the case $\eta_0 < \eta^*$. More
specifically, since $f_s(\eta) = \max_i f_{s,i}(\eta)$, at each fixed point iteration, we set
$\eta_{t+1}$ to be the first $f_{s,i}(\eta_t)$ that is greater than $\eta_t + \epsilon$, with $\epsilon$ some tolerance
parameter. If for all $i$, $f_{s,i}(\eta_t) \le \eta_t + \epsilon$, then $f_s(\eta_t) = \max_i f_{s,i}(\eta_t) \le \eta_t + \epsilon$,
which indicates that the optimal function value cannot be improved greatly and
the algorithm should terminate. In most cases, to get $\eta_{t+1}$, we need to solve
only one optimization problem $\max_z \{ z_i : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta_t \}$ instead of
$n$. This is in contrast to the case where $\eta_0 > \eta^*$, because in the latter case we
must compute all $f_{s,i}(\eta_t)$ to update $\eta_{t+1} = \max_i f_{s,i}(\eta_t)$. An update based
on a single $f_{s,i}(\eta_t)$ might generate a value smaller than $\eta^*$.

Fig. 4.1: Illustration of the naive fixed point iteration (4.11) when $\diamond = \infty$.

In Figure 4.1, we illustrate the behavior of the naive fixed point iteration
algorithm (4.11). These figures are generated by Matlab for a two-dimensional
problem. We index the sub-figures from left to right and from top
to bottom. The first (upper left) sub-figure shows the star-shaped region
$S = \{z : \|Qz\|_\infty \le 1,\ \|z\|_1/\|z\|_\infty \le s\}$. Starting from an initial $\eta_0 < \eta^*$, the
algorithm solves

$$\max_z \|z\|_\infty \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta_0 \tag{4.12}$$

in sub-figure 2. The solution is denoted by the black dot. Although the true
domain for the optimization in (4.12) is the intersection of the distorted $\ell_\infty$
ball $\{z : \|Qz\|_\infty \le 1\}$ and the $\ell_1$ ball $\{z : \|z\|_1 \le s\eta_0\}$, the intersection of the
$\ell_1$ ball (light gray diamond) and the star-shaped region $S$ forms the effective
domain, which is the dark gray region in the sub-figures. To see this, we
note that the optimal value of the optimization (4.12) is
$\eta_1 = \|x_1^*\|_\infty = f_s(\eta_0) > \eta_0$ according to 7) of Theorem 4.2, implying that, for the
optimal solution $x_1^*$, we have $\|x_1^*\|_1/\|x_1^*\|_\infty \le \|x_1^*\|_1/\eta_0 \le s$. Therefore, the
optimal solution $x_1^*$ can always be found in the dark gray region. In the following
sub-figures, at each iteration, we expand the $\ell_1$ ball until we get to the tip point of the
star-shaped region $S$, which is the global optimum.

Despite its simplicity, the naive fixed point iteration has two major
disadvantages. Firstly, the stopping criterion based on successive improvement
is not accurate, as it does not reflect the gap between $\eta_t$ and $\eta^*$. This
disadvantage can be remedied by starting from both below and above $\eta^*$. The
distance between corresponding terms in the two generated sequences is an
indication of the gap to the fixed point $\eta^*$. However, the resulting algorithm
is generally slow, especially when updating $\eta_{t+1}$ from above $\eta^*$. Secondly,
the iteration process is slow when close to the fixed point $\eta^*$. This is because
$\rho_1(\epsilon)$ and $\rho_2(\epsilon)$ in 8) of Theorem 4.2 are close to 1 for small $\epsilon > 0$.

2. Bisection: The bisection approach is motivated by property 7) of Theorem
4.2. Starting from an initial interval $(\eta_L, \eta_U)$ that contains $\eta^*$, we compute
$f_s(\eta_M)$ with $\eta_M = (\eta_L + \eta_U)/2$. As a consequence of property 7) and the
monotonicity in property 2), $f_s(\eta_M) > \eta_M$ implies $f_s(\eta_M) < \eta^*$, and we set
$\eta_L = f_s(\eta_M)$; $f_s(\eta_M) < \eta_M$ implies $f_s(\eta_M) > \eta^*$, and we set $\eta_U = f_s(\eta_M)$.
The bisection process can also be accelerated by setting $\eta_L = f_{s,i}(\eta_M)$ for the first
$f_{s,i}(\eta_M)$ greater than $\eta_M$. The convergence of the bisection approach is much faster
than the naive fixed point iteration because each iteration reduces the interval length
at least by half. In addition, half the length of the interval is an upper bound on the
gap between $\eta_M$ and $\eta^*$, resulting in an accurate stopping criterion. However, if the
initial $\eta_U$ is much larger than $\eta^*$, the majority of the $f_s(\eta_M)$ would turn out to be
less than $\eta_M$. The verification of $f_s(\eta_M) < \eta_M$ requires solving $n$ linear programs or
second-order cone programs, greatly degrading the algorithm's performance.

3. Fixed Point Iteration + Bisection: The third approach combines the
advantages of the bisection method and the fixed point iteration method,
at the level of $f_{s,i}(\eta)$. This method relies on the representation
$f_s(\eta) = \max_i f_{s,i}(\eta)$ and $\eta^* = \max_i \eta_i^*$.

Starting from an initial interval $(\eta_{L0}, \eta_U)$ and the index set $\mathcal{I}_0 = \{1, \dots, n\}$,
we pick any $i_0 \in \mathcal{I}_0$ and use the (accelerated) bisection method with starting
interval $(\eta_{L0}, \eta_U)$ to find the positive fixed point $\eta_{i_0}^*$ of $f_{s,i_0}(\eta)$. For any
$i \in \mathcal{I}_0 \setminus \{i_0\}$, $f_{s,i}(\eta_{i_0}^*) \le \eta_{i_0}^*$ implies that the fixed point $\eta_i^*$ of
$f_{s,i}(\eta)$ is less than or equal to $\eta_{i_0}^*$, according to the continuity of $f_{s,i}(\eta)$ and the
uniqueness of its positive fixed point. As a consequence, we remove this $i$ from the
index set $\mathcal{I}_0$. We denote by $\mathcal{I}_1$ the index set after all such $i$ are removed, i.e.,
$\mathcal{I}_1 = \mathcal{I}_0 \setminus \{i : f_{s,i}(\eta_{i_0}^*) \le \eta_{i_0}^*\}$. We then set $\eta_{L1} = \eta_{i_0}^*$, as
$\eta^* \ge \eta_{i_0}^*$. Next we test the $i_1 \in \mathcal{I}_1$ with the largest $f_{s,i}(\eta_{i_0}^*)$ and
construct $\mathcal{I}_2$ and $\eta_{L2}$ in a similar manner. We repeat the process until the index set
$\mathcal{I}_t$ is empty. The $\eta_i^*$ found at the last step is the maximal $\eta_i^*$, which is equal to $\eta^*$.
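The first two procedures above can be sketched for the case $\diamond = \infty$, where each $f_{s,i}(\eta)$ is a linear program. The code assumes `scipy.optimize.linprog` as the LP solver rather than the primal-dual method used in the text, omits the acceleration tricks and the combined third method, and uses hypothetical function names. The test case $Q = I$ is chosen because $\omega_\infty(I, s) = 1$ directly from Definition 3.3, so the fixed point is 1.

```python
import numpy as np
from scipy.optimize import linprog


def f_si(Q, s, eta, i):
    """f_{s,i}(eta) = max z_i s.t. ||Qz||_inf <= 1, ||z||_1 <= s*eta, as one LP."""
    m, n = Q.shape
    c = np.zeros(2 * n)
    c[i], c[n + i] = -1.0, 1.0                  # minimize -(u_i - v_i), z = u - v
    A_ub = np.vstack([np.hstack([Q, -Q]),       #  Q(u - v) <= 1
                      np.hstack([-Q, Q]),       # -Q(u - v) <= 1
                      np.ones((1, 2 * n))])     # sum(u) + sum(v) <= s*eta
    b_ub = np.concatenate([np.ones(2 * m), [s * eta]])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    return -res.fun


def f_s(Q, s, eta):
    """f_s(eta) = max_i f_{s,i}(eta), see (4.9)."""
    return max(f_si(Q, s, eta, i) for i in range(Q.shape[1]))


def naive_fixed_point(Q, s, eta0=0.1, tol=1e-8, max_iter=1000):
    """Iteration (4.11): eta_{t+1} = f_s(eta_t), stopped on small successive change."""
    eta = eta0
    for _ in range(max_iter):
        eta_next = f_s(Q, s, eta)
        if abs(eta_next - eta) <= tol:
            return eta_next
        eta = eta_next
    return eta


def bisection(Q, s, eta_l, eta_u, tol=1e-8, max_iter=200):
    """Bisection based on property 7); (eta_l, eta_u) must contain the fixed point."""
    for _ in range(max_iter):
        if eta_u - eta_l <= tol:
            break
        eta_m = 0.5 * (eta_l + eta_u)
        val = f_s(Q, s, eta_m)
        if val > eta_m:
            eta_l = val          # eta_m < f_s(eta_m) < eta*
        elif val < eta_m:
            eta_u = val          # eta* < f_s(eta_m) < eta_m
        else:
            return eta_m         # eta_m is the fixed point
    return 0.5 * (eta_l + eta_u)


Q = np.eye(3)                    # omega_inf(I, s) = 1, so the fixed point is 1
s = 1.5
eta_naive = naive_fixed_point(Q, s)
eta_bis = bisection(Q, s, 0.01, 10.0)
print(eta_naive, eta_bis, 1.0 / eta_bis)   # last value estimates omega_inf(Q, s)
```

Updating the interval endpoints with $f_s(\eta_M)$ instead of $\eta_M$, as the text prescribes, shrinks the interval by more than half whenever the function value lies strictly inside it.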

Note that in equations (4.6), (4.7), and (4.9), if we replace the $\ell_\infty$ norm with any
other norm (with some other minor modifications), especially $\|\cdot\|_{s,1}$ or $\|\cdot\|_2$, then a
naive fixed point iteration algorithm still exists. In addition, as we did in Corollary
3.2, we can express other norms on the error vector in terms of $\|\cdot\|_{s,1}$ and $\|\cdot\|_2$. We
expect that the norm $\|\cdot\|_{s,1}$ would yield the tightest performance bounds. Unfortunately,
the major problem is that in these cases, the function $f_s(\eta)$ does not admit an obvious
polynomial time algorithm to compute. It is very likely that the corresponding norm
maximizations defining $f_s(\eta)$ for $\|\cdot\|_{s,1}$ and $\|\cdot\|_2$ are NP-hard [4].


5. Probabilistic Behavior of $\omega_\diamond(Q, s)$. In [26], we defined the $\ell_1$-constrained
minimal singular value ($\ell_1$-CMSV) as a goodness measure of the sensing matrix and
established performance bounds using $\ell_1$-CMSV. For comparison, we include the
definition below:

Definition 1. For any $s \in [1, n]$ and matrix $A \in \mathbb{R}^{m \times n}$, define the $\ell_1$-constrained
minimal singular value (abbreviated as $\ell_1$-CMSV) of $A$ by

$$\rho_s(A) = \min_{z :\, \|z\|_1^2/\|z\|_2^2 \le s} \frac{\|Az\|_2}{\|z\|_2}. \tag{5.1}$$


Despite the seeming resemblance of the definitions between $\omega_\diamond(Q, s)$, especially
$\omega_2(A, s)$, and $\rho_s(A)$, the difference in the $\ell_\infty$ norm and the $\ell_2$ norm has important
implications. As shown in Theorem 4.2, the $\ell_\infty$ norm enables the design of optimization
procedures with nice convergence properties to efficiently compute $\omega_\diamond(Q, s)$. On the
other hand, the $\ell_1$-CMSV yields tight performance bounds at least for a large class
of random sensing matrices, as we will see in Theorem 5.2.

However, there are some interesting connections among these quantities, as shown
in the following proposition. These connections allow us to analyze the probabilistic
behavior of $\omega_\diamond(Q, s)$ using the results for $\rho_s(A)$ established in [26].

Proposition 5.1.

$$\sqrt{s}\,\sqrt{\omega_\infty(A^T A, s)} \ \ge\ \omega_2(A,s) \ \ge\ \rho_{s^2}(A). \qquad (5.2)$$


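Proof. For any $z$ such that $\|z\|_\infty = 1$ and $\|z\|_1 \le s$, we have

$$z^T A^T A z \ \le\ \sum_i |z_i|\,|(A^T A z)_i| \ \le\ \|z\|_1 \|A^T A z\|_\infty \ \le\ s\,\|A^T A z\|_\infty.$$

Taking the minimum over $\{z : \|z\|_\infty = 1,\ \|z\|_1 \le s\}$ yields

$$\omega_2^2(A,s) \le s\,\omega_\infty(A^T A, s).$$

Note that $\|z\|_1/\|z\|_\infty \le s$ implies $\|z\|_1 \le s\|z\|_\infty \le s\|z\|_2$, or equivalently,

$$\{z : \|z\|_1/\|z\|_\infty \le s\} \subseteq \{z : \|z\|_1/\|z\|_2 \le s\}.$$

As a consequence, we have

$$\omega_2(A,s) = \min_{\|z\|_1/\|z\|_\infty \le s} \frac{\|Az\|_2}{\|z\|_2}\cdot\frac{\|z\|_2}{\|z\|_\infty} \qquad (5.3)$$
$$\ge \min_{\|z\|_1/\|z\|_\infty \le s} \frac{\|Az\|_2}{\|z\|_2} \qquad (5.4)$$
$$\ge \min_{\|z\|_1/\|z\|_2 \le s} \frac{\|Az\|_2}{\|z\|_2} \qquad (5.5)$$
$$= \rho_{s^2}(A), \qquad (5.6)$$

where the first inequality is due to $\|z\|_2 \ge \|z\|_\infty$, and the second inequality holds because the minimization is taken over a larger set. □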
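The first chain of inequalities in the proof can be sanity-checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `check_chain`, the matrix size, and the random test data are illustrative assumptions, not part of the original experiments.

```python
import random

def check_chain(A, z):
    """Return (z^T A^T A z, ||z||_1 * ||A^T A z||_inf) for a matrix A given as a list of rows."""
    m, n = len(A), len(A[0])
    Az = [sum(A[r][c] * z[c] for c in range(n)) for r in range(m)]
    AtAz = [sum(A[r][c] * Az[r] for r in range(m)) for c in range(n)]
    quad = sum(v * v for v in Az)  # z^T A^T A z = ||Az||_2^2
    holder = sum(abs(v) for v in z) * max(abs(v) for v in AtAz)
    return quad, holder

# Hypothetical test data: a small Gaussian matrix and random vectors z.
random.seed(0)
A = [[random.gauss(0.0, 1.0) for _ in range(6)] for _ in range(4)]
for _ in range(1000):
    z = [random.uniform(-1.0, 1.0) for _ in range(6)]
    quad, holder = check_chain(A, z)
    # Hoelder-type step: z^T A^T A z <= ||z||_1 ||A^T A z||_inf
    assert quad <= holder + 1e-12
```

The check only exercises the middle inequality of the chain; the final step additionally uses the constraint $\|z\|_1 \le s$.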

As a consequence of the theorem we established in [26] and include below, we derive a condition on the number of measurements to get $\omega_\diamond(Q,s)$ bounded away from zero with high probability for sensing matrices with i.i.d. subgaussian and isotropic rows. Note that a random vector $X \in \mathbb{R}^n$ is called isotropic and subgaussian with constant $L$ if $\mathbb{E}|\langle X,u\rangle|^2 = \|u\|_2^2$ and $\mathbb{P}(|\langle X,u\rangle| \ge t) \le 2\exp(-t^2/(L\|u\|_2))$ hold for any $u \in \mathbb{R}^n$.

Theorem 5.2. [26] Let the rows of the scaled sensing matrix $\sqrt{m}A$ be i.i.d. subgaussian and isotropic random vectors with numerical constant $L$. Then there exist constants $c_1$ and $c_2$ such that for any $\epsilon > 0$ and $m \ge 1$ satisfying

$$m \ge c_1 \frac{L^2 s \log n}{\epsilon^2}, \qquad (5.7)$$

we have

$$\mathbb{E}\,|1 - \rho_s(A)| \le \epsilon, \qquad (5.8)$$

and

$$\mathbb{P}\{1-\epsilon \le \rho_s(A) \le 1+\epsilon\} \ge 1 - \exp(-c_2\epsilon^2 m/L^4). \qquad (5.9)$$

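Theorem 5.3. Under the assumptions and notations of Theorem 5.2, there exist constants $c_1$ and $c_2$ such that for any $\epsilon > 0$ and $m \ge 1$ satisfying

$$m \ge c_1 \frac{L^2 s^2 \log n}{\epsilon^2}, \qquad (5.10)$$

we have

$$\mathbb{E}\,\omega_2(A,s) \ge 1-\epsilon, \qquad (5.11)$$
$$\mathbb{P}\{\omega_2(A,s) \ge 1-\epsilon\} \ge 1 - \exp(-c_2\epsilon^2 m), \qquad (5.12)$$

and

$$\mathbb{E}\,\omega_\infty(A^T A, s) \ge \frac{(1-\epsilon)^2}{s}, \qquad (5.13)$$
$$\mathbb{P}\left\{\omega_\infty(A^T A, s) \ge \frac{(1-\epsilon)^2}{s}\right\} \ge 1 - \exp(-c_2\epsilon^2 m). \qquad (5.14)$$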
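The way Theorem 5.3 follows from Theorem 5.2 and Proposition 5.1 can be sketched as below. This is a reading of the statements above rather than a reproduction of the original proof; in particular, the use of Jensen's inequality for the expectation bound (5.13) and the restriction $\epsilon \le 1$ are assumptions of this sketch.

```latex
% Condition (5.10) is condition (5.7) with s replaced by s^2,
% so Theorem 5.2 applies to \rho_{s^2}(A).
\begin{align*}
\omega_2(A,s) &\ge \rho_{s^2}(A)
  && \text{by (5.2)} \\
\mathbb{E}\,\omega_2(A,s) &\ge \mathbb{E}\,\rho_{s^2}(A)
  \ge 1 - \mathbb{E}\,|1-\rho_{s^2}(A)| \ge 1-\epsilon
  && \text{by (5.8)} \\
\omega_\infty(A^TA,s) &\ge \frac{\omega_2^2(A,s)}{s}
  && \text{by (5.2)} \\
\mathbb{E}\,\omega_\infty(A^TA,s) &\ge \frac{\mathbb{E}\,\omega_2^2(A,s)}{s}
  \ge \frac{\left(\mathbb{E}\,\omega_2(A,s)\right)^2}{s}
  \ge \frac{(1-\epsilon)^2}{s}
  && \text{Jensen, assuming } \epsilon \le 1
\end{align*}
% On the event \{\rho_{s^2}(A) \ge 1-\epsilon\} of (5.9), both
% \omega_2(A,s) \ge 1-\epsilon and \omega_\infty(A^TA,s) \ge (1-\epsilon)^2/s hold,
% which gives the probability statements (5.12) and (5.14).
```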

Sensing matrices with i.i.d. subgaussian and isotropic rows include the Gaussian ensemble and the Bernoulli ensemble, as well as the normalized volume measure on various convex symmetric bodies, for example, the unit balls of $\ell_p^n$ for $2 \le p \le \infty$ [23]. In equations (5.13) and (5.14), the extra $s$ in the lower bound of $\omega_\infty(A^T A, s)$ would contribute an $s$ factor in the bounds of Theorem 3.4. It plays the same role as the extra $\sqrt{k}$ factor in the error bounds for the Dantzig selector and the LASSO estimator in terms of the RIC and the $\ell_1$-CMSV [10,26].

The measurement bound (5.10) implies that the algorithms for verifying $\omega_\diamond > 0$ and for computing $\omega_\diamond$ work for $s$ at least up to the order $\sqrt{m/\log n}$. The order $\sqrt{m/\log n}$ is complementary to the order-$\sqrt{m}$ upper bound in Proposition 4.1.

Note that Theorem 5.2 implies that the following program:

$$\max_z\ \|z\|_2 \quad \text{s.t.} \quad Az = 0,\ \|z\|_1 \le 1, \qquad (5.15)$$

verifies the sufficient condition for exact $\ell_1$ recovery for $s$ up to the order $m/\log n$, at least for subgaussian and isotropic random sensing matrices. Unfortunately, this program is NP-hard and hence not tractable.

6. Numerical Experiments. In this section, we provide implementation details and numerically assess the performance of the algorithms for solving (4.6) using the naive fixed point iteration. The numerical implementation and performance of (4.1) were previously reported in [26] and hence are omitted here. All the numerical experiments in this section were conducted on a desktop computer with a Pentium D CPU@3.40GHz, 2GB RAM, and the Windows XP operating system, and the computations ran on a single core.

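Recall that the optimization defining $f_{s,i}(\eta)$ is

$$\min_z\ z_i \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta. \qquad (6.1)$$

Since the feasible set is symmetric about the origin, minimizing $z_i$ is equivalent to maximizing it up to a sign.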

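Depending on whether $\diamond = 1$, $\infty$, or $2$, (6.1) is solved using either linear programs or second-order cone programs. For example, when $\diamond = \infty$, we have the following corresponding linear programs:

$$\min_{z,u}\ \begin{bmatrix} e_i^T & 0^T \end{bmatrix}\begin{bmatrix} z \\ u \end{bmatrix} \quad \text{s.t.} \quad \begin{bmatrix} Q & O \\ -Q & O \\ I & -I \\ -I & -I \\ 0^T & 1^T \end{bmatrix}\begin{bmatrix} z \\ u \end{bmatrix} \le \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ s\eta \end{bmatrix}. \qquad (6.2)$$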
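The block structure of (6.2) can be assembled mechanically. The sketch below, in plain Python, builds the cost vector and the inequality data for a given $Q$, index $i$, and product $s\eta$, and checks feasibility of a candidate point; it does not solve the linear program. The function names `build_lp` and `is_feasible` are hypothetical, and the auxiliary vector `u` plays the role of an upper bound on $|z|$.

```python
def build_lp(Q, i, s, eta):
    """Assemble c, G, h so that (6.2) reads: min c^T [z; u]  s.t.  G [z; u] <= h."""
    n = len(Q[0])
    zeros = [0.0] * n
    c = [1.0 if j == i else 0.0 for j in range(n)] + zeros
    G, h = [], []
    for row in Q:  # Q z <= 1
        G.append([float(v) for v in row] + zeros)
        h.append(1.0)
    for row in Q:  # -Q z <= 1
        G.append([-float(v) for v in row] + zeros)
        h.append(1.0)
    for j in range(n):  # z_j - u_j <= 0
        e = [1.0 if t == j else 0.0 for t in range(n)]
        G.append(e + [-v for v in e])
        h.append(0.0)
    for j in range(n):  # -z_j - u_j <= 0
        e = [1.0 if t == j else 0.0 for t in range(n)]
        G.append([-v for v in e] + [-v for v in e])
        h.append(0.0)
    G.append(zeros + [1.0] * n)  # 1^T u <= s * eta
    h.append(s * eta)
    return c, G, h

def is_feasible(G, h, x, tol=1e-12):
    """Check G x <= h componentwise, up to a small tolerance."""
    return all(sum(g * v for g, v in zip(row, x)) <= b + tol for row, b in zip(G, h))

# Illustrative data: Q is the 2x2 identity, s = 1.5, eta = 1.
c, G, h = build_lp([[1.0, 0.0], [0.0, 1.0]], 0, 1.5, 1.0)
print(is_feasible(G, h, [1.0, 0.5, 1.0, 0.5]))  # z = (1, 0.5), u = |z|
```

The arrays `c`, `G`, and `h` are in the form accepted by most general-purpose LP solvers; the paper itself uses a primal-dual interior-point method.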


These linear programs are implemented using the primal-dual algorithm outlined in Chapter 11 of [5]. The algorithm finds the optimal solution together with optimal dual vectors by solving the Karush-Kuhn-Tucker conditions using linearization. The major computation is spent in solving linear systems of equations with positive definite coefficient matrices. When $\diamond = 2$, we rewrite (6.1) as the following second-order cone programs:

$$\min_{z,u}\ \begin{bmatrix} e_i^T & 0^T \end{bmatrix}\begin{bmatrix} z \\ u \end{bmatrix} \quad \text{s.t.} \quad \frac{1}{2}\left(\left\|\begin{bmatrix} Q & O \end{bmatrix}\begin{bmatrix} z \\ u \end{bmatrix}\right\|_2^2 - 1\right) \le 0, \quad \begin{bmatrix} I & -I \\ -I & -I \\ 0^T & 1^T \end{bmatrix}\begin{bmatrix} z \\ u \end{bmatrix} \le \begin{bmatrix} 0 \\ 0 \\ s\eta \end{bmatrix}. \qquad (6.3)$$

We use the log-barrier algorithm described in Chapter 11 of [5] to solve (6.3). Interested readers are encouraged to refer to [6] for a concise exposition of the general primal-dual and log-barrier algorithms and implementation details for similar linear programs and second-order cone programs.

We test the algorithms on Bernoulli, Gaussian, and Hadamard matrices of different sizes. The entries of Bernoulli and Gaussian matrices are randomly generated from the classical Bernoulli distribution with equal probability and the standard Gaussian distribution, respectively. For Hadamard matrices, first a square Hadamard matrix of size $n$ ($n$ is a power of 2) is generated, then its rows are randomly permuted and its first $m$ rows are taken as an $m \times n$ sensing matrix. All $m \times n$ matrices are normalized to have columns of unit length.
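The Hadamard construction described above can be sketched as follows. The text does not say how the square Hadamard matrix is generated; the Sylvester doubling used here is an assumption, and the function names are hypothetical.

```python
import math
import random

def sylvester_hadamard(n):
    """n x n Hadamard matrix with +/-1 entries via Sylvester doubling (n a power of 2)."""
    assert n >= 1 and n & (n - 1) == 0, "n must be a power of 2"
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def hadamard_sensing_matrix(m, n, rng):
    """Randomly permute the rows, keep the first m, and normalize columns to unit length."""
    H = sylvester_hadamard(n)
    rows = list(range(n))
    rng.shuffle(rows)
    A = [[float(v) for v in H[r]] for r in rows[:m]]
    for c in range(n):
        norm = math.sqrt(sum(A[r][c] ** 2 for r in range(m)))
        for r in range(m):
            A[r][c] /= norm
    return A

A = hadamard_sensing_matrix(3, 8, random.Random(0))
print(len(A), len(A[0]))
```

Because every entry of a Hadamard matrix is $\pm 1$, each column of the $m$-row submatrix has norm $\sqrt{m}$, so the normalization amounts to dividing by $\sqrt{m}$.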
We compare our recovery error bounds based on $\omega_\diamond$ with those based on the RIC. Combining Corollary 3.2 and Theorem 3.4, we have for the Basis Pursuit

$$\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}}{\omega_2(A, 2k)}\,\epsilon, \qquad (6.4)$$

and for the Dantzig selector

$$\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}}{\omega_\infty(A^T A, 2k)}\,\mu. \qquad (6.5)$$

For comparison, the two RIC bounds are

$$\|\hat{x} - x\|_2 \le \frac{4\sqrt{1+\delta_{2k}(A)}}{1 - (1+\sqrt{2})\,\delta_{2k}(A)}\,\epsilon \qquad (6.6)$$

for the Basis Pursuit, assuming $\delta_{2k}(A) < \sqrt{2} - 1$ [7], and

$$\|\hat{x} - x\|_2 \le \frac{4\sqrt{k}}{1 - \delta_{2k}(A) - \delta_{3k}(A)}\,\mu \qquad (6.7)$$

for the Dantzig selector, assuming $\delta_{2k}(A) + \delta_{3k}(A) < 1$ [10]. Without loss of generality, we set $\epsilon = 1$ and $\mu = 1$.
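Given values of the goodness measures or of the RIC, the four bounds are simple closed-form expressions. The helper functions below are a minimal sketch with hypothetical names; they return `None` when the validity assumption of a RIC bound fails, which corresponds to a blank table entry.

```python
import math

def omega_bound(omega, k, noise=1.0):
    """Bounds (6.4)/(6.5): 2*sqrt(2k)/omega times the noise level (epsilon or mu)."""
    return 2.0 * math.sqrt(2 * k) * noise / omega

def ric_bp_bound(delta_2k, eps=1.0):
    """Bound (6.6) for the Basis Pursuit; valid only if delta_2k < sqrt(2) - 1."""
    if delta_2k >= math.sqrt(2) - 1:
        return None
    return 4.0 * math.sqrt(1.0 + delta_2k) * eps / (1.0 - (1.0 + math.sqrt(2)) * delta_2k)

def ric_ds_bound(k, delta_2k, delta_3k, mu=1.0):
    """Bound (6.7) for the Dantzig selector; valid only if delta_2k + delta_3k < 1."""
    if delta_2k + delta_3k >= 1.0:
        return None
    return 4.0 * math.sqrt(k) * mu / (1.0 - delta_2k - delta_3k)

# Illustrative inputs only; these are not values from the experiments.
print(omega_bound(0.7, 1), ric_bp_bound(0.3), ric_ds_bound(2, 0.3, 0.4))
```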

The RIC is computed using Monte Carlo simulations. More explicitly, for $\delta_{2k}(A)$, we randomly take 1000 sub-matrices of $A \in \mathbb{R}^{m\times n}$ of size $m \times 2k$, compute the maximal and minimal singular values $\sigma_1$ and $\sigma_{2k}$, and approximate $\delta_{2k}(A)$ using the maximum of $\max(\sigma_1^2 - 1,\ 1 - \sigma_{2k}^2)$ among all sampled sub-matrices. Obviously, the approximated RIC is always smaller than or equal to the exact RIC. As a consequence, the performance bounds based on the exact RIC are worse than those based on the approximated RIC. Therefore, in cases where our $\omega$ based bounds are better (tighter, smaller) than the approximated RIC bounds, they are even better than the exact RIC bounds.

In Tables 6.1, 6.2, and 6.3, we compare the error bounds (6.4) and (6.6) for the Basis Pursuit algorithm. In the tables, we also include $s^*$ computed by (4.2), and $k^* = \lfloor s^*/2 \rfloor$, i.e., the maximal sparsity level such that the sufficient and necessary condition (2.7) holds. The number of measurements is taken as $m = \lfloor \rho n \rceil$ ($\rho n$ rounded to the nearest integer), $\rho = 0.2, 0.3, \ldots, 0.8$. Note that blank entries mean that the corresponding bounds are not valid. For the Bernoulli and Gaussian matrices, the RIC bounds work only for $k \le 2$, even with $\rho = 0.8$, while the $\omega_2(A, 2k)$ bounds work up until $k = 9$. Both bounds are better for Hadamard matrices. For example, when $m = 0.5n$, the RIC bounds are valid for $k \le 3$, and our bounds hold for $k \le 5$. In all cases for $n = 256$, our bounds are smaller than the RIC bounds.
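The header rows of the tables can be reproduced with a few lines of arithmetic. The sketch below assumes that $m$ is $\rho n$ rounded to the nearest integer, which is inferred from the listed values 51, 77, ..., 205 for $n = 256$, and uses the $s^*$ row of Table 6.1 as input.

```python
import math

n = 256
# rho = 0.2, 0.3, ..., 0.8 expressed in tenths so the rounding is done in integers.
tenths = [2, 3, 4, 5, 6, 7, 8]
m_values = [(t * n + 5) // 10 for t in tenths]  # round(rho * n) to nearest integer

# s* row of Table 6.1 (Bernoulli matrix) and the derived k* = floor(s* / 2).
s_star = [4.6, 6.1, 7.4, 9.6, 12.1, 15.2, 19.3]
k_star = [math.floor(s / 2) for s in s_star]

print(m_values)
print(k_star)
```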

We next compare the error bounds (6.5) and (6.7) for the Dantzig selector. For the Bernoulli and Gaussian matrices, our bounds work for wider ranges of $(k, m)$ pairs and are tighter in all tested cases. For the Hadamard matrices, the RIC bounds are better, starting from $k \ge 5$ or 6. We expect that this indicates a general trend, namely, when $k$ is relatively small, the $\omega$ based bounds are better, while when $k$ is large, the RIC bounds are tighter. This was suggested by the probabilistic analysis of $\omega$ in Section 5. The reason is that when $k$ is relatively small, both the relaxation $\|x\|_1 \le 2k\|x\|_\infty$ on the sufficient and necessary condition (2.7) and the relaxation $\|\hat{x} - x\|_2 \le \sqrt{2k}\,\|\hat{x} - x\|_\infty$ are sufficiently tight.

Table 6.1: Comparison of the $\omega_2$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Basis Pursuit algorithm for a Bernoulli matrix with leading dimension $n = 256$.

m          |     51 |     77 |    102 |    128 |    154 |    179 |    205
s*         |    4.6 |    6.1 |    7.4 |    9.6 |   12.1 |   15.2 |   19.3
k*         |      2 |      3 |      3 |      4 |      6 |      7 |      9
k=1 ω bd   |    4.2 |    3.8 |    3.5 |    3.4 |    3.3 |    3.2 |    3.2
k=1 ric bd |        |        |   23.7 |   16.1 |   13.2 |   10.6 |   11.9
k=2 ω bd   |   31.4 |   12.2 |    9.0 |    7.4 |    6.5 |    6.0 |    5.6
k=2 ric bd |        |        |        |        |        |   72.1 |  192.2
k=3 ω bd   |        |  252.0 |   30.9 |   16.8 |   12.0 |   10.1 |    8.9
k=3 ric bd |        |        |        |        |        |        |
k=4 ω bd   |        |        |        |   52.3 |   23.4 |   16.5 |   13.6
k=4 ric bd |        |        |        |        |        |        |
k=5 ω bd   |        |        |        |        |   57.0 |   28.6 |   20.1
k=5 ric bd |        |        |        |        |        |        |
k=6 ω bd   |        |        |        |        | 1256.6 |   53.6 |   30.8
k=6 ric bd |        |        |        |        |        |        |
k=7 ω bd   |        |        |        |        |        |  161.6 |   50.6
k=7 ric bd |        |        |        |        |        |        |
k=8 ω bd   |        |        |        |        |        |        |   93.1
k=8 ric bd |        |        |        |        |        |        |
k=9 ω bd   |        |        |        |        |        |        |  258.7
k=9 ric bd |        |        |        |        |        |        |

In Table 6.7 we present the execution times for computing different $\omega$. For random matrices with leading dimension $n = 256$, the algorithm generally takes 1 to 3 minutes to compute either $\omega_2(A, s)$ or $\omega_\infty(A^T A, s)$.

In the last set of experiments, we compute $\omega_2(A, 2k)$ and $\omega_\infty(A^T A, 2k)$ for a Gaussian matrix and a Hadamard matrix, respectively, with leading dimension $n = 512$. The row dimensions of the sensing matrices range over $m = \lfloor \rho n \rceil$ with $\rho = 0.2, 0.3, \ldots, 0.8$. In Figure 6.1, we compare the $\ell_2$ norm error bounds of the Basis Pursuit using $\omega_2(A, 2k)$ and the RIC. The color indicates the values of the error bounds. We remove all bounds that are greater than 50 or are not valid. Hence, all white areas correspond to $(k, m)$ pairs whose bounds are too large or not valid. The left sub-figure is based on $\omega_2(A, 2k)$ and the right sub-figure is based on the RIC. We observe that the $\omega_2(A, 2k)$ based bounds apply to a wider range of $(k, m)$ pairs.

In Figure 6.2, we conduct the same experiment as in Figure 6.1 for a Hadamard matrix and the Dantzig selector. We observe that for the Hadamard matrix, the RIC gives better performance bounds. This result coincides with the one we obtained in Table 6.5.

The average time for computing each $\omega_2(A, 2k)$ and $\omega_\infty(A^T A, 2k)$ was around 15 minutes.

Fig. 6.1: $\omega_2(A, 2k)$ based bounds vs. RIC based bounds on the $\ell_2$ norms of the errors for a Gaussian matrix with leading dimension $n = 512$. Left: $\omega_2(A, 2k)$ based bounds; Right: RIC based bounds.

Fig. 6.2: $\omega_\infty(A^T A, 2k)$ based bounds vs. RIC based bounds on the $\ell_2$ norms of the errors for a Hadamard matrix with leading dimension $n = 512$. Left: $\omega_\infty(A^T A, 2k)$ based bounds; Right: RIC based bounds.

Table 6.2: Comparison of the $\omega_2$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Basis Pursuit algorithm for a Hadamard matrix with leading dimension $n = 256$.

m           |     51 |     77 |    102 |    128 |    154 |         179 |         205
s*          |    5.4 |    7.1 |    9.1 |   11.4 |   14.0 |        18.4 |        25.3
k*          |      2 |      3 |      4 |      5 |      6 |           9 |          12
k=1 ω bd    |    3.8 |    3.5 |    3.3 |    3.2 |    3.1 |         3.0 |         3.0
k=1 ric bd  |   46.6 |   13.2 |    9.2 |    9.4 |    8.3 |         6.2 |         5.2
k=2 ω bd    |   13.7 |    8.4 |    6.7 |    5.9 |    5.4 |         4.9 |         4.6
k=2 ric bd  |        |        |   46.6 |   24.2 |   15.3 |         8.6 |         7.1
k=3 ω bd    |        |   30.9 |   14.0 |   10.1 |    8.4 |         7.1 |         6.3
k=3 ric bd  |        |        | 1356.6 |   25.4 |   10.3 | [illegible] | [illegible]
k=4 ω bd    |        |        |   47.4 |   18.9 |   13.2 |         9.9 |         8.1
k=4 ric bd  |        |        |        |        |   40.0 |        14.0 |        10.2
k=5 ω bd    |        |        |        |   51.5 |   22.6 |        13.8 |        10.3
k=5 ric bd  |        |        |        |        |        |        18.8 |        11.6
k=6 ω bd    |        |        |        |        |   50.8 |        20.1 |        13.1
k=6 ric bd  |        |        |        |        |        |        42.5 |        15.9
k=7 ω bd    |        |        |        |        |        |        31.8 |        16.7
k=7 ric bd  |        |        |        |        |        |        94.2 |        19.7
k=8 ω bd    |        |        |        |        |        |        63.5 |        21.7
k=8 ric bd  |        |        |        |        |        |      1000.0 |        24.6
k=9 ω bd    |        |        |        |        |        |       449.8 |        29.4
k=9 ric bd  |        |        |        |        |        |             |        39.1
k=10 ω bd   |        |        |        |        |        |             |        42.8
k=10 ric bd |        |        |        |        |        |             |        35.6
k=11 ω bd   |        |        |        |        |        |             |        72.7
k=11 ric bd |        |        |        |        |        |             |       134.1
k=12 ω bd   |        |        |        |        |        |             |       195.1
k=12 ric bd |        |        |        |        |        |             |

7. Conclusions. In this paper, we analyzed the performance of $\ell_1$ sparse signal recovery algorithms using the $\ell_\infty$ norm of the errors as a performance criterion. We expressed other popular performance criteria in terms of the $\ell_\infty$ norm. A family of goodness measures of the sensing matrices was defined using optimization procedures. We used these goodness measures to derive upper bounds on the $\ell_\infty$ norms of the reconstruction errors for the Basis Pursuit, the Dantzig selector, and the LASSO estimator. Polynomial-time algorithms with established convergence properties were implemented to efficiently solve the optimization procedures defining the goodness measures. We expect that these goodness measures will be useful in comparing different sensing systems and recovery algorithms, as well as in designing optimal sensing matrices. In future work, we will use these computable performance bounds to optimally design $k$-space sample trajectories for MRI and to optimally design transmitting waveforms for compressive sensing radar.

8.  Appendix:  Proofs. 

8.1. Proof of Proposition 3.1. Suppose $S = \mathrm{supp}(x)$ and $|S| = \|x\|_0 = k$. Define the error vector $h = \hat{x} - x$. For any vector $z \in \mathbb{R}^n$ and any index set $S \subseteq \{1, \ldots, n\}$, we use $z_S \in \mathbb{R}^{|S|}$ to represent the vector whose elements are those of $z$ indicated by $S$.

Table 6.3: Comparison of the $\omega_2$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Basis Pursuit algorithm for a Gaussian matrix with leading dimension $n = 256$.

m          |     51 |     77 |    102 |    128 |    154 |    179 |    205
s*         |    4.6 |    6.2 |    8.1 |    9.9 |   12.5 |   15.6 |   20.0
k*         |      2 |      3 |      4 |      4 |      6 |      7 |     10
k=1 ω bd   |    4.3 |    3.7 |    3.5 |    3.4 |    3.3 |    3.2 |    3.2
k=1 ric bd |        |        |   26.0 |   14.2 |   10.0 |   10.9 |   12.1
k=2 ω bd   |   34.3 |   12.3 |    8.3 |    7.0 |    6.4 |    5.9 |    5.6
k=2 ric bd |        |        |        |        |        |   47.1 |   27.6
k=3 ω bd   |        |  197.4 |   23.4 |   14.5 |   11.6 |    9.8 |    8.9
k=3 ric bd |        |        |        |        |        |        |
k=4 ω bd   |        |        | 1036.6 |   39.6 |   21.7 |   15.9 |   13.4
k=4 ric bd |        |        |        |        |        |        |
k=5 ω bd   |        |        |        |        |   49.3 |   26.4 |   20.0
k=5 ric bd |        |        |        |        |        |        |
k=6 ω bd   |        |        |        |        |  284.2 |   48.8 |   31.2
k=6 ric bd |        |        |        |        |        |        |
k=7 ω bd   |        |        |        |        |        |  129.1 |   48.1
k=7 ric bd |        |        |        |        |        |        |
k=8 ω bd   |        |        |        |        |        |        |  185.5
k=8 ric bd |        |        |        |        |        |        |
k=9 ω bd   |        |        |        |        |        |        | 9640.3
k=9 ric bd |        |        |        |        |        |        |

We first deal with the Basis Pursuit and the Dantzig selector. As observed by Candès in [7], the fact that $\|\hat{x}\|_1 = \|x + h\|_1$ is the minimum among all $z$'s satisfying the constraints in (2.2) and (2.3), together with the fact that the true signal $x$ satisfies the constraints as required by the conditions imposed on the noise in Proposition 3.1, implies that $\|h_{S^c}\|_1$ cannot be very large. To see this, note that

$$\|x\|_1 \ge \|x + h\|_1 = \sum_{i \in S} |x_i + h_i| + \sum_{i \in S^c} |x_i + h_i| \ge \|x_S\|_1 - \|h_S\|_1 + \|h_{S^c}\|_1 = \|x\|_1 - \|h_S\|_1 + \|h_{S^c}\|_1. \qquad (8.1)$$

Therefore, we obtain $\|h_S\|_1 \ge \|h_{S^c}\|_1$, which leads to

$$2\|h_S\|_1 \ge \|h_S\|_1 + \|h_{S^c}\|_1 = \|h\|_1. \qquad (8.2)$$
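The implication behind (8.1) and (8.2), namely that $\|x + h\|_1 \le \|x\|_1$ forces $2\|h_S\|_1 \ge \|h\|_1$, can be checked on random perturbations. The sketch below uses a hypothetical helper `cone_holds` and arbitrary test vectors; it illustrates the inequality and is not part of the proof.

```python
import random

def l1(v):
    return sum(abs(t) for t in v)

def cone_holds(x, h, tol=1e-12):
    """Return None if ||x + h||_1 > ||x||_1; otherwise whether 2 ||h_S||_1 >= ||h||_1,
    where S is the support of x."""
    if l1([a + b for a, b in zip(x, h)]) > l1(x):
        return None
    h_S = sum(abs(h[i]) for i, xi in enumerate(x) if xi != 0)
    return 2.0 * h_S >= l1(h) - tol

# Hypothetical sparse signal with support {0, 1} and random error vectors.
random.seed(1)
x = [3.0, -2.0, 0.0, 0.0, 0.0, 0.0]
for _ in range(20000):
    h = [random.uniform(-1.0, 1.0) for _ in x]
    assert cone_holds(x, h) in (None, True)
```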


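Table 6.4: Comparison of the $\omega_\infty$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Dantzig selector algorithm for the Bernoulli matrix used in Table 6.1.

m          |      51 |     77 |    102 |    128 |     154 |    179 |     205
s*         |     4.6 |    6.1 |    7.4 |    9.6 |    12.1 |   15.2 |    19.3
k*         |       2 |      3 |      3 |      4 |       6 |      7 |       9
k=1 ω bd   |     6.0 |    5.4 |    4.8 |    4.4 |     4.2 |    4.1 |     4.1
k=1 ric bd |         |   46.3 |   17.4 |   12.1 |    11.2 |   10.3 |     8.6
k=2 ω bd   |   102.8 |   38.4 |   29.0 |   18.5 |    14.1 |   12.8 |    11.9
k=2 ric bd |         |        |        |        |         |   47.2 |    22.5
k=3 ω bd   |         | 1477.2 |  170.2 |   81.2 |    57.0 |   41.1 |    32.6
k=3 ric bd |         |        |        |        |         |        |
k=4 ω bd   |         |        |        |  522.7 |   194.6 |  128.9 |    89.0
k=4 ric bd |         |        |        |        |         |        |
k=5 ω bd   |         |        |        |        |   768.7 |  323.6 |   203.2
k=5 ric bd |         |        |        |        |         |        |
k=6 ω bd   |         |        |        |        | 24974.0 |  888.7 |   489.0
k=6 ric bd |         |        |        |        |         |        |
k=7 ω bd   |         |        |        |        |         | 3417.3 |  1006.9
k=7 ric bd |         |        |        |        |         |        |
k=8 ω bd   |         |        |        |        |         |        |  2740.0
k=8 ric bd |         |        |        |        |         |        |
k=9 ω bd   |         |        |        |        |         |        | 10196.9
k=9 ric bd |         |        |        |        |         |        |

We now turn to the LASSO estimator (2.4). We use the proof technique in [8] (see also [3]). Since the noise $w$ satisfies $\|A^T w\|_\infty \le \kappa\mu$ for some small $\kappa > 0$, and $\hat{x}$ is a solution to (2.4), we have

$$\tfrac{1}{2}\|A\hat{x} - y\|_2^2 + \mu\|\hat{x}\|_1 \le \tfrac{1}{2}\|Ax - y\|_2^2 + \mu\|x\|_1.$$

Consequently, substituting $y = Ax + w$ yields

$$\mu\|\hat{x}\|_1 \le \tfrac{1}{2}\|Ax - y\|_2^2 - \tfrac{1}{2}\|A\hat{x} - y\|_2^2 + \mu\|x\|_1$$
$$= \tfrac{1}{2}\|w\|_2^2 - \tfrac{1}{2}\|A(\hat{x} - x) - w\|_2^2 + \mu\|x\|_1$$
$$= \tfrac{1}{2}\|w\|_2^2 - \tfrac{1}{2}\|A(\hat{x} - x)\|_2^2 + \langle A(\hat{x} - x), w\rangle - \tfrac{1}{2}\|w\|_2^2 + \mu\|x\|_1$$
$$\le \langle A(\hat{x} - x), w\rangle + \mu\|x\|_1$$
$$= \langle \hat{x} - x, A^T w\rangle + \mu\|x\|_1.$$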

Table 6.5: Comparison of the $\omega_\infty$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Dantzig selector algorithm for the Hadamard matrix used in Table 6.2.

m           |     51 |     77 |    102 |    128 |    154 |         179 |         205
s*          |    5.2 |    6.9 |    9.1 |   12.1 |   14.4 |        18.3 |        25.2
k*          |      2 |      3 |      4 |      6 |      7 |           9 |          12
k=1 ω bd    |    4.8 |    4.0 |    3.8 |    3.4 |    3.4 |         3.2 |         3.1
k=1 ric bd  |        |   15.6 |    9.3 |    7.0 |    6.3 |         5.8 |         5.1
k=2 ω bd    |   50.9 |   16.2 |   10.1 |    7.1 |    7.0 |         6.1 |         5.3
k=2 ric bd  |        |   45.3 |   16.6 |   13.7 |   10.6 | [illegible] | [illegible]
k=3 ω bd    |        |  108.2 |   30.7 |   14.3 |   13.9 |        10.0 |         8.0
k=3 ric bd  |        |        | 1016.4 |   29.9 |   24.9 |        15.8 |        12.5
k=4 ω bd    |        |        |  150.7 |   35.3 |   29.3 |        16.8 |        11.7
k=4 ric bd  |        |        |        |  126.4 |   38.7 |        24.2 |        16.6
k=5 ω bd    |        |        |        |  108.5 |   64.2 |        31.4 |        17.3
k=5 ric bd  |        |        |        |        |  187.3 |        30.0 |        22.1
k=6 ω bd    |        |        |        | 3168.9 |  171.5 |        59.7 |        25.3
k=6 ric bd  |        |        |        |        |  112.0 |        53.1 |        26.8
k=7 ω bd    |        |        |        |        | 1499.5 |       116.3 |        38.8
k=7 ric bd  |        |        |        |        |  411.7 |        71.3 |        34.7
k=8 ω bd    |        |        |        |        |        |       265.3 |        61.4
k=8 ric bd  |        |        |        |        |        |        95.4 |        47.6
k=9 ω bd    |        |        |        |        |        |      2394.0 |        96.0
k=9 ric bd  |        |        |        |        |        |       198.7 |        61.9
k=10 ω bd   |        |        |        |        |        |             |       157.4
k=10 ric bd |        |        |        |        |        |             |        82.9
k=11 ω bd   |        |        |        |        |        |             |       296.4
k=11 ric bd |        |        |        |        |        |             |       130.3
k=12 ω bd   |        |        |        |        |        |             |       898.2
k=12 ric bd |        |        |        |        |        |             |       201.2

Using a Cauchy-Schwarz type inequality, we get

$$\mu\|\hat{x}\|_1 \le \|\hat{x} - x\|_1\|A^T w\|_\infty + \mu\|x\|_1 \le \kappa\mu\|h\|_1 + \mu\|x\|_1,$$

which leads to

$$\|\hat{x}\|_1 \le \kappa\|h\|_1 + \|x\|_1.$$

Therefore, similar to the argument in (8.1), we have

$$\|x\|_1 \ge \|\hat{x}\|_1 - \kappa\|h\|_1$$
$$= \|x + h_{S^c} + h_S\|_1 - \kappa\,\|h_{S^c} + h_S\|_1$$
$$\ge \|x + h_{S^c}\|_1 - \|h_S\|_1 - \kappa\left(\|h_{S^c}\|_1 + \|h_S\|_1\right)$$
$$= \|x\|_1 + (1-\kappa)\|h_{S^c}\|_1 - (1+\kappa)\|h_S\|_1,$$

where $S = \mathrm{supp}(x)$. Consequently, we have

$$\|h_S\|_1 \ge \frac{1-\kappa}{1+\kappa}\,\|h_{S^c}\|_1.$$

Table 6.6: Comparison of the $\omega_\infty$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Dantzig selector algorithm for the Gaussian matrix used in Table 6.3.

m          |     51 |     77 |    102 |    128 |    154 |    179 |    205
s*         |    4.6 |    6.2 |    8.1 |    9.9 |   12.5 |   15.6 |   20.0
k*         |      2 |      3 |      4 |      4 |      6 |      7 |     10
k=1 ω bd   |    6.5 |    5.1 |    4.8 |    4.3 |    4.2 |    4.0 |    3.9
k=1 ric bd |        |   30.0 |   18.0 |   14.6 |    9.7 |    9.3 |    9.1
k=2 ω bd   |  119.4 |   37.8 |   22.5 |   17.6 |   14.1 |   12.7 |   11.4
k=2 ric bd |        |        |        |        |   91.5 |   44.4 |   23.5
k=3 ω bd   |        | 1216.7 |  120.7 |   67.3 |   53.6 |   38.7 |   36.4
k=3 ric bd |        |        |        |        |        |        | 2546.6
k=4 ω bd   |        |        | 4515.9 |  318.2 |  168.4 |  115.8 |  109.0
k=4 ric bd |        |        |        |        |        |        |
k=5 ω bd   |        |        |        |        |  663.6 |  292.4 |  247.8
k=5 ric bd |        |        |        |        |        |        |
k=6 ω bd   |        |        |        |        | 5231.4 |  764.3 |  453.5
k=6 ric bd |        |        |        |        |        |        |
k=7 ω bd   |        |        |        |        |        | 2646.4 | 1087.7
k=7 ric bd |        |        |        |        |        |        |
k=8 ω bd   |        |        |        |        |        |        | 2450.5
k=8 ric bd |        |        |        |        |        |        |
k=9 ω bd   |        |        |        |        |        |        | 6759.0
k=9 ric bd |        |        |        |        |        |        |

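Therefore, similar to (8.2), we obtain

$$\frac{2}{1-\kappa}\|h_S\|_1 = \frac{1+\kappa}{1-\kappa}\|h_S\|_1 + \frac{1-\kappa}{1-\kappa}\|h_S\|_1 \ge \frac{1+\kappa}{1-\kappa}\cdot\frac{1-\kappa}{1+\kappa}\|h_{S^c}\|_1 + \|h_S\|_1 = \|h\|_1. \qquad (8.3)$$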


8.2. Proof of Theorem 4.2.

1. In the optimization problem defining $f_{s,i}(\eta)$, the objective function $z_i$ is continuous, and the constraint correspondence

$$C(\eta) : [0, \infty) \rightrightarrows \mathbb{R}^n, \qquad \eta \mapsto \{z : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta\} \qquad (8.4)$$

is compact-valued and continuous (both upper and lower hemicontinuous). Hence, according to Berge's Maximum Theorem [2], the optimal value function $f_{s,i}(\eta)$ is continuous. The continuity of $f_s(\eta)$ follows from the fact that finite maximization preserves continuity.

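Table 6.7: Time in seconds taken to compute $\omega_2(A, \cdot)$ and $\omega_\infty(A^T A, \cdot)$ for Bernoulli, Hadamard, and Gaussian matrices.

k | type      | measure |  51 |  77 | 102 | 128 | 154 | 179 | 205
1 | Bernoulli | ω2      | 118 |  84 | 133 |  87 | 133 | 174 | 128
1 | Bernoulli | ω∞      |  75 |  81 |  84 |  65 |  63 | 144 | 151
1 | Hadamard  | ω2      |  84 |  82 |  82 |  82 |  80 |  79 |  79
1 | Hadamard  | ω∞      |  57 |  55 |  58 |  58 |  58 |  58 |  57
1 | Gaussian  | ω2      |  82 |  84 | 212 | 106 | 156 | 185 | 104
1 | Gaussian  | ω∞      |  69 |  65 |  72 | 102 |  81 | 104 |  72
3 | Bernoulli | ω2      |     | 155 |  96 |  95 |  97 |  97 | 131
3 | Bernoulli | ω∞      |     | 300 | 228 | 190 | 125 | 135 | 196
3 | Hadamard  | ω2      |     |  91 |  88 |  87 |  88 |  74 |  72
3 | Hadamard  | ω∞      |     |  84 |  83 |  77 |  92 | 102 |  70
3 | Gaussian  | ω2      |     | 134 | 168 | 115 |  95 |  96 | 100
3 | Gaussian  | ω∞      |     | 137 | 142 | 125 | 165 | 145 | 105
5 | Bernoulli | ω2      |     |     |     |     |  97 | 111 |  97
5 | Bernoulli | ω∞      |     |     |     |     | 156 |  81 | 107
5 | Hadamard  | ω2      |     |     |     |  87 |  85 |  85 |  81
5 | Hadamard  | ω∞      |     |     |     |  75 |  74 |  75 |  75
7 | Bernoulli | ω2      |     |     |     |     |     | 164 | 104
7 | Bernoulli | ω∞      |     |     |     |     |     | 178 |  85

Entries whose column alignment is not recoverable, in source order: $k = 5$, Gaussian: 98, 105, 96, 193; $k = 7$, Hadamard: 134, 82, 71, 77, 65; $k = 7$, Gaussian: 106, 105, 193.

2. To show the strict increasing property, suppose $0 < \eta_1 < \eta_2$ and the dual variable $\lambda_2$ achieves $f_{s,i}(\eta_2)$ in (4.10). Then we have

$$f_{s,i}(\eta_1) \le s\eta_1\|e_i - Q^T\lambda_2\|_\infty + \|\lambda_2\|_\diamond^* < s\eta_2\|e_i - Q^T\lambda_2\|_\infty + \|\lambda_2\|_\diamond^* = f_{s,i}(\eta_2). \qquad (8.5)$$

The case for $\eta_1 = 0$ is proved by continuity, and the strict increasing of $f_s(\eta)$ follows immediately.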

3. The concavity of $f_{s,i}(\eta)$ follows from the dual representation (4.10) and the fact that $f_{s,i}(\eta)$ is the minimization of a function of variables $\eta$ and $\lambda$, and when $\lambda$, the variable to be minimized, is fixed, the function is linear in $\eta$.

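4. Next we show that when $\eta > 0$ is sufficiently small, $f_s(\eta) \ge s\eta$. Taking $z = s\eta e_i$, we have $\|z\|_1 = s\eta$ and $z_i = s\eta > \eta$ (recall $s \in (1, \infty)$). In addition, when $0 < \eta \le 1/(s\|Q_i\|_\diamond)$, we also have $\|Qz\|_\diamond \le 1$. Therefore, for sufficiently small $\eta$, we have $f_{s,i}(\eta) \ge s\eta > \eta$. Clearly, $f_s(\eta) = \max_i f_{s,i}(\eta) \ge s\eta > \eta$ for such $\eta$.

Recall that

$$\frac{1}{s^*} = \max_i \min_{\lambda_i} \|e_i - Q^T\lambda_i\|_\infty. \qquad (8.6)$$

Suppose $\lambda_i^*$ is the optimal solution for each $\min_{\lambda_i}\|e_i - Q^T\lambda_i\|_\infty$. For each $i$, we then have

$$\frac{1}{s^*} \ge \|e_i - Q^T\lambda_i^*\|_\infty, \qquad (8.7)$$

which implies

$$f_{s,i}(\eta) = \min_{\lambda_i}\ s\eta\|e_i - Q^T\lambda_i\|_\infty + \|\lambda_i\|_\diamond^* \le s\eta\|e_i - Q^T\lambda_i^*\|_\infty + \|\lambda_i^*\|_\diamond^* \le \frac{s}{s^*}\eta + \|\lambda_i^*\|_\diamond^*. \qquad (8.8)$$

As a consequence, we obtain

$$f_s(\eta) = \max_i f_{s,i}(\eta) \le \frac{s}{s^*}\eta + \max_i \|\lambda_i^*\|_\diamond^*. \qquad (8.9)$$

Pick $\rho \in (s/s^*, 1)$. Then, we have the following when $\eta > \max_i \|\lambda_i^*\|_\diamond^*/(\rho - s/s^*)$:

$$f_{s,i}(\eta) < \rho\eta,\ i = 1, \ldots, n, \quad \text{and} \quad f_s(\eta) < \rho\eta. \qquad (8.10)$$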

5. We first show the existence and uniqueness of the positive fixed points for $f_{s,i}(\eta)$. The properties 1) and 4) imply that $f_{s,i}(\eta)$ has at least one positive fixed point. (Interestingly, 2) and 4) also imply the existence of a positive fixed point, see [28].) To prove uniqueness, suppose there are two fixed points $0 < \eta_1^* < \eta_2^*$. Pick $\eta_0$ small enough such that $f_{s,i}(\eta_0) > \eta_0 > 0$ and $\eta_0 < \eta_1^*$. Then $\eta_1^* = \lambda\eta_0 + (1-\lambda)\eta_2^*$ for some $\lambda \in (0,1)$, which implies that $f_{s,i}(\eta_1^*) \ge \lambda f_{s,i}(\eta_0) + (1-\lambda) f_{s,i}(\eta_2^*) > \lambda\eta_0 + (1-\lambda)\eta_2^* = \eta_1^*$ due to the concavity, contradicting $\eta_1^* = f_{s,i}(\eta_1^*)$.

The set of positive fixed points for $f_s(\eta)$, $\{\eta \in (0,\infty) : \eta = f_s(\eta) = \max_i f_{s,i}(\eta)\}$, is a subset of $\bigcup_{i=1}^n \{\eta \in (0,\infty) : \eta = f_{s,i}(\eta)\} = \{\eta_i^*\}_{i=1}^n$. We argue that

$$\eta^* = \max_i \eta_i^* \qquad (8.11)$$

is the unique positive fixed point for $f_s(\eta)$.

We proceed to show that $\eta^*$ is a fixed point of $f_s(\eta)$. Suppose $\eta^*$ is a fixed point of $f_{s,i_0}(\eta)$; then it suffices to show that $f_s(\eta^*) = \max_i f_{s,i}(\eta^*) = f_{s,i_0}(\eta^*)$. If this is not the case, there exists $i_1 \ne i_0$ such that $f_{s,i_1}(\eta^*) > f_{s,i_0}(\eta^*) = \eta^*$. The continuity of $f_{s,i_1}(\eta)$ and the property 4) imply that there exists $\eta > \eta^*$ with $f_{s,i_1}(\eta) = \eta$, contradicting the definition of $\eta^*$.

To show the uniqueness, suppose $\eta_1^*$ is a fixed point of $f_{s,i_1}(\eta)$ satisfying $\eta_1^* < \eta^*$. Then, we must have $f_{s,i_0}(\eta_1^*) > f_{s,i_1}(\eta_1^*)$ because otherwise the continuity implies the existence of another fixed point of $f_{s,i_0}(\eta)$. As a consequence, $f_s(\eta_1^*) > f_{s,i_1}(\eta_1^*) = \eta_1^*$ and $\eta_1^*$ is not a fixed point of $f_s(\eta)$.
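The role of continuity, concavity, and the behavior near zero and at infinity in forcing a unique positive fixed point can be illustrated with a toy function. In the sketch below, `toy_f` is a minimum of finitely many affine functions of $\eta$, which mimics the shape of the dual representation (4.10) but is not the paper's $f_s$; the slopes and intercepts are arbitrary illustrative values.

```python
def toy_f(eta, pieces):
    """Concave, increasing toy function: minimum of affine pieces a * eta + b."""
    return min(a * eta + b for a, b in pieces)

def fixed_point(f, eta0, tol=1e-12, max_iter=10000):
    """Naive fixed point iteration eta <- f(eta), started from a small positive eta0."""
    eta = eta0
    for _ in range(max_iter):
        nxt = f(eta)
        if abs(nxt - eta) <= tol:
            return nxt
        eta = nxt
    raise RuntimeError("fixed point iteration did not converge")

# Piece (2.0, 0.0) plays the role of the lower bound s * eta near zero (s = 2 > 1);
# piece (0.5, 1.5) has slope below 1, so the function eventually falls below eta.
pieces = [(2.0, 0.0), (0.5, 1.5)]
eta_star = fixed_point(lambda e: toy_f(e, pieces), 0.1)
print(eta_star)
```

For these pieces the only positive solution of $\eta = \min(2\eta,\ 0.5\eta + 1.5)$ is $\eta = 3$, and the iterates increase monotonically toward it from any small positive starting point.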

6. Next we show $\eta^* = \gamma^* \triangleq 1/\omega_\diamond(Q, s)$. We first prove $\gamma^* \ge \eta^*$ for the fixed point $\eta^* = f_s(\eta^*)$. Suppose $z^*$ achieves the optimization problem defining $f_s(\eta^*)$; then we have

$$\eta^* = f_s(\eta^*) = \|z^*\|_\infty, \quad \|Qz^*\|_\diamond \le 1, \quad \text{and} \quad \|z^*\|_1 \le s\eta^*. \qquad (8.12)$$

Since $\|z^*\|_1/\|z^*\|_\infty \le s\eta^*/\eta^* \le s$, we have

$$\gamma^* \ge \frac{\|z^*\|_\infty}{\|Qz^*\|_\diamond} \ge \eta^*. \qquad (8.13)$$

If $\eta^* < \gamma^*$, we define $\eta_0 = (\eta^* + \gamma^*)/2$ and

$$z^c = \arg\max_z \frac{s\|z\|_\infty}{\|z\|_1} \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \|z\|_\infty \ge \eta_0, \qquad (8.14)$$

$$\rho = \frac{s\|z^c\|_\infty}{\|z^c\|_1}. \qquad (8.15)$$

Suppose $z^{**}$ with $\|Qz^{**}\|_\diamond = 1$ achieves the optimum of the optimization (4.6) defining $\gamma^* = 1/\omega_\diamond(Q, s)$. Clearly, $\|z^{**}\|_\infty = \gamma^* > \eta_0$, which implies $z^{**}$ is a feasible point of the optimization problem (8.14) defining $z^c$ and $\rho$. As a consequence, we have

$$\rho \ge \frac{s\|z^{**}\|_\infty}{\|z^{**}\|_1} \ge 1. \qquad (8.16)$$

Fig. 8.1: Illustration of the proof for $\rho > 1$.

Actually we will show that $\rho > 1$. If $\|z^{**}\|_1 < s\|z^{**}\|_\infty$, we are done. If not (i.e., $\|z^{**}\|_1 = s\|z^{**}\|_\infty$), as illustrated in Figure 8.1, we consider $\xi = \frac{\eta_0}{\gamma^*} z^{**}$, which satisfies

$$\|Q\xi\|_\diamond = \frac{\eta_0}{\gamma^*} < 1, \qquad (8.17)$$
$$\|\xi\|_\infty = \eta_0, \quad \text{and} \qquad (8.18)$$
$$\|\xi\|_1 = s\eta_0. \qquad (8.19)$$

To get $\xi^n$ as shown in Figure 8.1, pick the component of $\xi$ with the smallest non-zero absolute value, and scale that component by a small positive constant less than 1. Because $s > 1$, $\xi$ has more than one non-zero component, implying $\|\xi^n\|_\infty$ will remain the same. If the scaling constant is close enough to 1, $\|Q\xi^n\|_\diamond$ will remain less than 1 due to continuity. But the good news is that $\|\xi^n\|_1$ decreases, and hence $\rho \ge s\|\xi^n\|_\infty/\|\xi^n\|_1$ becomes greater than 1.

Now we proceed to obtain the contradiction $f_s(\eta^*) > \eta^*$. If $\|z^c\|_1 \le s\eta^*$, then it is a feasible point of

$$\max_z\ \|z\|_\infty \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta^*. \qquad (8.20)$$

As a consequence, $f_s(\eta^*) \ge \|z^c\|_\infty \ge \eta_0 > \eta^*$, contradicting the assumption that $\eta^*$ is a fixed point, and we are done. If this is not the case, i.e., $\|z^c\|_1 > s\eta^*$, we define a new point

$$z^n = \tau z^c \qquad (8.21)$$

with

$$\tau = \frac{s\eta^*}{\|z^c\|_1} < 1. \qquad (8.22)$$

Note that $z^n$ is a feasible point of the optimization problem defining $f_s(\eta^*)$ since

$$\|Qz^n\|_\diamond = \tau\|Qz^c\|_\diamond < 1, \quad \text{and} \qquad (8.23)$$
$$\|z^n\|_1 = \tau\|z^c\|_1 = s\eta^*. \qquad (8.24)$$

Furthermore, we have

$$\|z^n\|_\infty = \tau\|z^c\|_\infty = \rho\eta^*. \qquad (8.25)$$

As a consequence, we obtain a contradiction

$$f_s(\eta^*) \ge \rho\eta^* > \eta^*. \qquad (8.26)$$

Therefore, for the fixed point $\eta^*$, we have $\eta^* = \gamma^* = 1/\omega_\diamond(Q, s)$.

7. This property simply follows from the continuity, the uniqueness, and property 4).

8. We use contradiction to show the existence of $\rho_1(\epsilon)$ in 8). In view of 4), we need only to show the existence of such a $\rho_1(\epsilon)$ that works for $\eta_L \le \eta \le (1-\epsilon)\eta^*$, where $\eta_L = \sup\{\eta : f_s(\xi) \ge s\xi,\ \forall\, 0 < \xi \le \eta\}$. Suppose otherwise; we then construct sequences $\{\eta^{(k)}\}_{k=1}^\infty \subset [\eta_L, (1-\epsilon)\eta^*]$ and $\{\rho_1^{(k)}\}_{k=1}^\infty \subset (1, \infty)$ with

$$\lim_{k\to\infty} \rho_1^{(k)} = 1, \qquad f_s(\eta^{(k)}) \le \rho_1^{(k)}\eta^{(k)}. \qquad (8.27)$$

Due to the compactness of $[\eta_L, (1-\epsilon)\eta^*]$, there must exist a subsequence $\{\eta^{(k_l)}\}_{l=1}^\infty$ of $\{\eta^{(k)}\}$ such that $\lim_{l\to\infty} \eta^{(k_l)} = \eta_{\lim}$ for some $\eta_{\lim} \in [\eta_L, (1-\epsilon)\eta^*]$. As a consequence of the continuity of $f_s(\eta)$, we have

$$f_s(\eta_{\lim}) = \lim_{l\to\infty} f_s(\eta^{(k_l)}) \le \lim_{l\to\infty} \rho_1^{(k_l)}\eta^{(k_l)} = \eta_{\lim}. \qquad (8.28)$$

Again due to the continuity of $f_s(\eta)$ and the fact that $f_s(\eta) > \eta$ for $\eta < \eta_L$, there exists $\eta_c \in [\eta_L, \eta_{\lim}]$ such that

$$f_s(\eta_c) = \eta_c, \qquad (8.29)$$

contradicting the uniqueness of the fixed point for $f_s(\eta)$. The existence of $\rho_2(\epsilon)$ can be proved in a similar manner.